CLONING AND EXPRESSION OF cDNA FOR HUMAN DIHYDROPYRIMIDINE DEHYDROGENASE

TECHNICAL FIELD OF THE INVENTION 

The present invention relates to methods and compositions for detecting
deficiencies in dihydropyrimidine dehydrogenase (DPD) levels in mammals,
including humans. The methods and compositions are useful for identifying
persons who are at risk of a toxic reaction to the commonly employed cancer
chemotherapy agent 5-fluorouracil.

BACKGROUND OF THE INVENTION 
5-Fluorouracil (5-FU) is commonly used in the treatment of cancers, including
cancers of the breast, head, neck, and digestive system. The efficacy of 5-FU
as a cancer treatment varies significantly among patients. Clinically
significant differences in systemic clearance and systemic exposure of 5-FU
are often observed [Grem, J.L. In Chabner, B.A. and J.M. Collins (eds.),
Cancer Chemotherapy: Principles and Practice, pp. 180-224, Philadelphia, PA,
Lippincott, 1990]. Furthermore, 5-FU treatment is severely toxic to some
patients, and has even caused death [Fleming et al. (1993) Eur. J. Cancer
29A: 740-744; Thyss et al. (1986) Cancer Chemother. Pharmacol. 16: 64-66;
Santini et al. (1989) Br. J. Cancer 59: 287-290; Goldberg et al. (1988) Br. J.
Cancer 57: 186-189; Trump et al. (1991) J. Clin. Oncol. 9: 2027-2035; Au et
al. (1982) Cancer Res. 42: 2930-2937].

Patients in whom 5-FU is severely toxic typically have low levels of
dihydropyrimidine dehydrogenase (DPD) activity [Tuchman et al. (1985) N.
Engl. J. Med. 313: 245-249; Diasio et al. (1988) J. Clin. Invest. 81: 47-51;
Fleming et al. (1991) Proc. Am. Assoc. Cancer Res. 32: 179; Harris et al.
(1991) Cancer (Phila.) 68: 499-501; Houyau et al. (1993) J. Nat'l. Cancer Inst.
85: 1602-1603; Lyss et al. (1993) Cancer Invest. 11: 239-240].
Dihydropyrimidine dehydrogenase (DPD, EC 1.3.1.2) is the principal enzyme
involved in the degradation of 5-FU, which acts by inhibiting thymidylate
synthase [Heggie et al. (1987) Cancer Res. 47: 2203-2206; Chabner et al.
(1989) In DeVita et al. (eds.), Cancer - Principles and Practice of Oncology, pp.
349-395, Philadelphia, PA, Lippincott; Diasio et al. (1989) Clin. Pharmacokinet.
16: 215-237; Grem et al., supra]. The level of DPD activity also affects the
efficacy of 5-FU treatments, as 5-FU plasma levels are inversely correlated with
the level of DPD activity [Iigo et al. (1988) Biochem. Pharm. 37: 1609-1613;
Goldberg et al., supra; Harris et al., supra; Fleming et al., supra]. In turn, the
efficacy of 5-FU treatment of cancer is correlated with plasma levels of 5-FU.

In addition to its 5-FU degrading activity, DPD is also the initial and
rate limiting enzyme in the three-step pathway of uracil and thymine
catabolism, leading to the formation of β-alanine and β-aminoisobutyric acid,
respectively [Wasternack et al. (1980) Pharm. Ther. 8: 629-665]. DPD deficiency is
associated with inherited disorders of pyrimidine metabolism, clinically termed
thymine-uraciluria [Bakkeren et al. (1984) Clin. Chim. Acta 140: 247-256]. Clinical
symptoms of DPD deficiency include a nonspecific cerebral dysfunction, and DPD
deficiency is associated with psychomotor retardation, convulsions, and epileptic
conditions [Berger et al. (1984) Clin. Chim. Acta 141: 227-234; Wadman et al.
(1985) Adv. Exp. Med. Biol. 165A: 109-114; Wilcken et al. (1985) J. Inherit.
Metab. Dis. 8 (Suppl. 2): 115-116; van Gennip et al. (1989) Adv. Exp. Med. Biol.
253A: 111-118; Brockstedt et al. (1990) J. Inherit. Metab. Dis. 12: 121-124;
Duran et al. (1991) J. Inherit. Metab. Dis. 14: 367-370]. Biochemically, patients
having DPD deficiency have an almost complete absence of DPD activity in
fibroblasts [Bakkeren et al., supra] and in lymphocytes [Berger et al., supra; Piper
et al. (1980) Biochim. Biophys. Acta 633: 400-409]. These patients typically have
a large accumulation of uracil and thymine in their cerebrospinal fluid [Bakkeren et
al., supra] and urine [Berger et al., supra; Bakkeren et al., supra; Brockstedt et
al., supra; Fleming et al. (1992) Cancer Res. 52: 2899-2902].

Familial studies suggest that DPD deficiency follows an autosomal
recessive pattern of inheritance [Diasio et al. (1988) supra]. Up to three percent
of the general human population are estimated to be putative heterozygotes for
DPD deficiency, as determined by enzymatic activity in lymphocytes [Milano and
Etienne (1994) Pharmacogenetics (in press)]. This suggests that the frequency of
homozygotes for DPD deficiency may be as high as one person per thousand.

DPD has been purified from liver tissue of rats [Shiotani and Weber
(1981) J. Biol. Chem. 256: 219-224; Fujimoto et al. (1991) J. Nutr. Sci. Vitaminol.
37: 89-98], pig [Podschun et al. (1989) Eur. J. Biochem. 185: 219-224], cattle
[Porter et al. (1991) J. Biol. Chem. 266: 19988-19994], and human [Lu et al.
(1992) J. Biol. Chem. 267: 1702-1709]. The pig enzyme contains flavins and
iron-sulfur prosthetic groups and exists as a homodimer with a monomer Mr of about
107,000 [Podschun et al., supra]. Since the enzyme exhibits a nonclassical
two-site ping-pong mechanism, it appears to have distinct binding sites for
NADPH/NADP and uracil/5,6-dihydrouracil [Podschun et al. (1990) J. Biol. Chem.
265: 12966-12972]. An acid-base catalytic mechanism has been proposed for
DPD [Podschun et al. (1993) J. Biol. Chem. 268: 3407-3413].
Because an undetected DPD deficiency poses a significant danger to a
cancer patient who is being treated with 5-FU, a great need exists for a simple and
accurate test for DPD deficiency. Such a test will also facilitate diagnosis of
disorders that are associated with DPD deficiency, such as uraciluria. The present
invention provides such a test, thus fulfilling these and other needs.


SUMMARY OF THE INVENTION 
The claimed invention includes isolated nucleic acids that code for a
dihydropyrimidine dehydrogenase (DPD) protein. Human and pig DPD cDNA
sequences are claimed (Seq. ID No. 1 and Seq. ID No. 3, respectively), as are DPD
nucleic acids that are capable of selectively hybridizing to the human or pig DPD
cDNAs under stringent hybridization conditions. Oligonucleotide probes that are
capable of selectively hybridizing, under stringent hybridizing conditions, to a
human or pig DPD nucleic acid are also claimed. The invention also includes
isolated nucleic acids that code for a DPD polypeptide that specifically binds to an
antibody generated against an immunogen consisting of a human or pig DPD
polypeptide having an amino acid sequence as depicted by Seq. ID No. 2 or Seq. ID
No. 4.

Also claimed are methods for determining whether a patient is at risk
of a toxic reaction to 5-fluorouracil (5-FU). The methods involve analyzing DPD
DNA or mRNA in a sample from the patient to determine the amount of intact DPD
nucleic acid. An enhanced risk of a toxic reaction to 5-fluorouracil is indicated by a
decrease in the amount of intact DPD DNA or mRNA in the sample compared to the
amount of DPD DNA or mRNA in a sample obtained from a patient known to not
have a DPD deficiency, or by a defect in the DPD nucleic acid that results in an
inadequate level of DPD activity.

The invention also includes methods for expressing recombinant DPD
protein in a prokaryotic cell. The methods involve transfecting the cell with an
expression vector comprising a promoter that is operably linked to a nucleic acid
that encodes DPD, and incubating the cell in a medium that contains uracil to allow
expression of the recombinant DPD protein.

Also claimed are expression vectors that utilize a nucleic acid that 
encodes DPD as a selectable marker. These selectable markers function in both 
eukaryotes and prokaryotes. 



BRIEF DESCRIPTION OF THE FIGURES
Figures 1A-1B show the nucleotide sequence of the human DPYD cDNA.

Figures 2A-2B show the nucleotide sequence of the pig DPYD cDNA.
Figure 3 shows a comparison of the pig and human DPD cDNA 
deduced amino acid sequences. Only those amino acid residues of human DPD 
that differ from the pig sequences are shown below the pig DPD amino acid 
sequence. The following motifs relevant for catalytic activity are boxed: 
NADPH/NADP binding, FAD binding, uracil binding, and 4Fe-4S binding.

Figure 4 shows the pedigree of a family used for a study of 
inheritance of DPD deficiency. Symbols are as follows: □ male, O female. Dotted 
symbols indicate intermediate DPD activity, a dashed square indicates high (normal) 
DPD activity, and ■ indicates undetectable DPD activity. 

Figure 5 shows a Southern blot of the products from reverse 
transcriptase PCR amplified cDNA for the subjects shown in Figure 4. The 906 and 
741 bp bands correspond to the wild-type and the deleted DPD cDNA fragments, 
respectively. "+" signifies the presence of the wild-type allele and "-" signifies
the presence of the mutant allele.

Figure 6 is a schematic of the wild-type and mutant DPD cDNAs. 
Numbers above the cDNA graphical representation represent nucleotide positions. 
Start and stop codons are indicated. 

Figure 7 is a PCR analysis of the DPD cDNA deletion found in the
subject family. The numbers of the subjects correspond to those indicated in Figure
4. Lane 6 is a negative control (no template present) and Lane 7 contains a 1 kb
marker ladder (GIBCO BRL).



DESCRIPTION OF THE SPECIFIC EMBODIMENTS

Definitions 

Abbreviations for the twenty naturally occurring amino acids follow
conventional usage. In the polypeptide notation used herein, the left-hand direction
is the amino terminal direction and the right-hand direction is the carboxy-terminal
direction, in accordance with standard usage and convention.

The term "nucleic acids," as used herein, refers to either DNA or
RNA. Included are single or double-stranded polymers of deoxyribonucleotide or
ribonucleotide bases. Self-replicating plasmids, infectious polymers of DNA or RNA
and nonfunctional DNA or RNA are included. Unless specified otherwise, the left
hand end of single-stranded polynucleotide sequences is the 5' end. The direction
of 5' to 3' addition of ribonucleotides to nascent RNA transcripts is referred to as
the transcription direction; sequence regions on the DNA strand having the same
sequence as the RNA and which are 5' to the 5' end of the RNA transcript are
referred to as "upstream sequences;" sequence regions on the DNA strand having
the same sequence as the RNA and which are 3' to the 3' end of the RNA
transcript are referred to as "downstream sequences."

"Nucleic acid probes" or "oligonucleotide probes" can be DNA or 
RNA fragments. Where a specific sequence for a nucleic acid probe is given, it is 
understood that the complementary strand is also identified and included. The 

complementary strand will work equally well in situations where the target is a
double-stranded nucleic acid.
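
As a minimal illustration of the statement that a given probe sequence also identifies its complementary strand, the following Python sketch derives that strand, written 5' to 3'. The helper name `reverse_complement` and the example sequence are hypothetical and are not taken from the sequence listing.

```python
# Hypothetical helper: derive the complementary strand of a DNA probe.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the complementary strand of a DNA sequence, written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


probe = "ATGGCCCTGA"  # illustrative sequence only
complementary_probe = reverse_complement(probe)
print(complementary_probe)  # TCAGGGCCAT
```

Applying the function twice returns the original probe, which reflects the symmetry between the two strands of a duplex.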

The phrase "selectively hybridizing to" refers to a nucleic acid probe
that, under appropriate hybridization conditions, hybridizes, duplexes or binds only
to a particular target DNA or RNA sequence when the target sequences are present
in a preparation of DNA or RNA. "Complementary" or "target" nucleic acid
sequences refer to those nucleic acids that selectively hybridize to a nucleic acid
probe. Proper annealing conditions depend, for example, upon a probe's length,
base composition, and the number of mismatches and their position on the probe,
and must often be determined empirically. For discussions of nucleic acid probe
design and annealing conditions, see, for example, Sambrook et al., Molecular
Cloning: A Laboratory Manual (2nd ed.), Vols. 1-3, Cold Spring Harbor Laboratory,
(1989) or Current Protocols in Molecular Biology, F. Ausubel et al., (ed.) Greene
Publishing and Wiley-Interscience, New York (1987).

The terms "stringent conditions" and "conditions of high stringency"
refer to conditions under which a nucleic acid probe will hybridize substantially to
its target subsequence, but to no other sequences. Stringent conditions are
sequence-dependent and will be different in different circumstances. Longer
sequences hybridize specifically at higher temperatures. Generally, stringent
conditions are selected to be about 5°C lower than the thermal melting point (Tm)
for the specific sequence at a defined ionic strength and pH. The Tm is the
temperature (under defined ionic strength and pH) at which 50% of the target
sequence hybridizes to a complementary probe. Typically, stringent conditions will
be those in which the salt concentration is at least about 0.2 molar at pH 7 and the
temperature is at least about 60°C for long sequences (e.g. greater than about 50
nucleotides) and at least about 42°C for shorter sequences (e.g. 10 to 50
nucleotides). As other factors may significantly affect the stringency of
hybridization, including, among others, base composition and size of the
complementary strands, the presence of organic solvents and the extent of base
mismatching, the combination of parameters is more important than the absolute
measure of any one.
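
The temperature guidance above can be sketched in Python as follows. The sketch applies the "about 5°C below the Tm" rule and the typical lower bounds of 60°C and 42°C stated in the text. The Tm estimate itself uses the Wallace rule of thumb (2°C per A/T, 4°C per G/C), which is an assumption introduced here for illustration only; it is not part of the specification and is unreliable for long probes. All function names and the example probe are hypothetical.

```python
def wallace_tm(seq: str) -> float:
    """Rough Tm estimate (deg C) for a short oligonucleotide.

    Assumption: Wallace rule of thumb, 2 deg C per A/T and 4 deg C per G/C.
    Not taken from the specification.
    """
    s = seq.upper()
    at = s.count("A") + s.count("T")
    gc = s.count("G") + s.count("C")
    return 2.0 * at + 4.0 * gc


def stringent_temperature(tm: float, offset: float = 5.0) -> float:
    """Temperature about `offset` deg C below the Tm, per the general rule."""
    return tm - offset


def typical_minimum_temperature(probe_length: int) -> float:
    """Typical lower bound from the text: 60 deg C for probes longer than
    about 50 nucleotides, 42 deg C for probes of 10 to 50 nucleotides."""
    if probe_length < 10:
        raise ValueError("the text gives no guidance for probes under 10 nt")
    return 60.0 if probe_length > 50 else 42.0


probe = "ATGGCCCTGAGCAAGGACTC"  # illustrative 20-mer, not from the sequence listing
tm = wallace_tm(probe)
print(tm, stringent_temperature(tm), typical_minimum_temperature(len(probe)))
```

As the text notes, salt concentration, organic solvents, and mismatching also affect stringency, so a calculation of this kind is only a starting point for empirical optimization.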

A nucleic acid is said to "encode" or "code for" a specific protein 
when the nucleic acid sequence comprises, in the proper order, codons for each of 
the amino acids of the protein or a specific subsequence of the protein. The nucleic 
acids include both the DNA strand that is transcribed into RNA and the RNA strand
that is translated into protein. It is further understood that the invention includes 
nucleic acids that differ from the DPD sequences specifically disclosed herein in 
that particular codons are replaced by degenerate codons, so that the variant 
nucleic acid encodes a protein having the same amino acid sequence as that 
encoded by the specifically disclosed nucleic acids. 
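
The point about degenerate codons can be illustrated with a short Python sketch in which two different DNA sequences encode the same peptide. The codon table is a small subset of the standard genetic code, and both sequences are hypothetical examples, not fragments of the disclosed DPD cDNAs.

```python
# Subset of the standard genetic code, sufficient for this illustration.
CODON_TABLE = {
    "ATG": "M",
    "GCT": "A", "GCC": "A", "GCA": "A", "GCG": "A",
    "CTG": "L", "CTC": "L", "TTA": "L",
    "AGC": "S", "TCT": "S",
    "AAA": "K", "AAG": "K",
}


def translate(dna: str) -> str:
    """Translate a coding sequence whose length is a multiple of three."""
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    codons = [dna[i:i + 3] for i in range(0, len(dna), 3)]
    return "".join(CODON_TABLE[codon] for codon in codons)


original = "ATGGCCCTGAGCAAG"  # hypothetical sequence: ATG GCC CTG AGC AAG
variant = "ATGGCTTTATCTAAA"   # degenerate codons:     ATG GCT TTA TCT AAA

print(translate(original), translate(variant))  # MALSK MALSK
```

The two nucleic acids differ at four of five codons yet specify the same amino acid sequence, which is the sense in which the variant "encodes" the same protein.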

The phrase "isolated" or "substantially pure," when referring to
nucleic acids that encode DPD, refers to nucleic acids that are sufficiently pure that
the predominant nucleic acid species in the preparation is the desired DPD nucleic
acid. Preferably, the DPD nucleic acids are more than 70% pure, more preferably
greater than 90% pure, and most preferably greater than 95% pure.

The term "control sequence" refers to a DNA sequence or sequences
that are capable, when properly attached to a desired coding sequence, of causing
expression of the coding sequence. Such control sequences include at least
promoters and, optionally, transcription termination signals. Additional factors
necessary or helpful for expression can also be included. As used herein, "control
sequences" simply refers to whatever DNA sequence signal is useful to result
in expression in the particular host used. Often, control sequences are utilized as
an "expression cassette," in which the control sequences are operably linked to
the nucleic acid that is to be expressed.

The term "operably linked" as used herein refers to a juxtaposition
wherein the components are configured so as to perform their usual function.
Thus, control sequences or promoters operably linked to a coding sequence are
capable of effecting the expression of the coding sequence.

The term "vector" refers to nucleic acids that are capable of
replicating in a selected host organism. The vector can replicate as an autonomous
structure, or alternatively can integrate into the host cell chromosome(s) and thus
replicate along with the host cell genome. Vectors include viral- or
bacteriophage-based expression systems, autonomous self-replicating circular DNA
(plasmids), and include both expression and nonexpression vectors. The term
"plasmid" refers to an autonomous circular DNA molecule capable of replication in
a cell, and includes both the expression and nonexpression types.

The phrase "recombinant protein" or "recombinantly produced
protein" refers to a peptide or protein produced using recombinant DNA techniques.
Host cells produce the recombinant protein because they have been genetically 
altered by the introduction of the appropriate nucleic acid that codes for the protein. 
Typically, the heterologous nucleic acid is introduced as part of an expression 
vector. 

The following terms are used to describe the sequence relationships
between two or more nucleic acids or polynucleotides: "reference sequence",
"comparison window", "sequence identity", "percentage of sequence identity", and
"substantial identity". A "reference sequence" is a defined sequence used as a
basis for a sequence comparison; a reference sequence can comprise a complete
cDNA or gene sequence, such as the nucleic acid sequence of Seq. ID Nos. 1 or 3,
or can be a subset of a larger sequence, for example, as a segment of a full-length
cDNA or gene sequence.

Optimal alignment of sequences for aligning a comparison window can
be conducted by the local homology algorithm of Smith and Waterman (1981) Adv.
Appl. Math. 2:482, by the homology alignment algorithm of Needleman and
Wunsch (1970) J. Mol. Biol. 48:443, by the search for similarity method of Pearson
and Lipman (1988) Proc. Natl. Acad. Sci. (USA) 85:2444, or by computerized
implementations of these algorithms (e.g., GAP, BESTFIT, FASTA, and TFASTA in
the Wisconsin Genetics Software Package Release 7.0, Genetics Computer Group,
575 Science Dr., Madison, WI).

The terms "substantial identity" or "substantial sequence identity" as 
applied to nucleic acids and as used herein denote a characteristic of a nucleotide 
sequence wherein the polynucleotide comprises a sequence that has at least 85 
percent sequence identity, preferably at least 90 to 95 percent sequence identity, 
and more preferably at least 99 percent sequence identity as compared to a 
reference sequence over a comparison window of at least 20 nucleotide positions, 
frequently over a window of at least 25-50 nucleotides. The percentage of 
sequence identity is calculated by comparing the reference sequence to the 
polynucleotide sequence, which may include deletions or additions which total 20 
percent or less of the reference sequence over the window of comparison. The 
reference sequence may be a subset of a larger sequence, such as a segment or 
subsequence of the human DPD gene disclosed herein. 
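
The "substantial identity" test for nucleic acids can be sketched in Python as shown below. The sketch assumes the two sequences have already been aligned (for example by one of the algorithms cited above) and that gaps are written as "-". Dividing the number of matches by the alignment length is one common convention and is an assumption here; the text does not fix the denominator. The function names and example sequences are hypothetical.

```python
def percent_identity(reference: str, candidate: str) -> float:
    """Percent identity of two pre-aligned sequences of equal length ('-' = gap)."""
    if len(reference) != len(candidate):
        raise ValueError("aligned sequences must have equal length")
    matches = sum(1 for r, c in zip(reference, candidate) if r == c and r != "-")
    return 100.0 * matches / len(reference)


def has_substantial_identity(reference: str, candidate: str,
                             threshold: float = 85.0,
                             min_window: int = 20,
                             max_gap_fraction: float = 0.20) -> bool:
    """Apply the thresholds from the text: at least 85 percent identity over a
    window of at least 20 positions, with deletions or additions totalling
    20 percent or less of the reference."""
    ref_len = len(reference.replace("-", ""))
    gaps = reference.count("-") + candidate.count("-")
    if ref_len < min_window or gaps > max_gap_fraction * ref_len:
        return False
    return percent_identity(reference, candidate) >= threshold


ref = "ATGGCCCTGAGCAAGGACTC"     # illustrative 20-nucleotide window
close = "ATGGCCCTGAGCAAGGACTA"   # one mismatch
distant = "TTGGCCCTGAGCTAGGTCTA" # four mismatches

print(percent_identity(ref, close), has_substantial_identity(ref, close))
print(percent_identity(ref, distant), has_substantial_identity(ref, distant))
```

Raising `threshold` to 90, 95, or 99 corresponds to the preferred and more preferred levels of identity recited above.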

As applied to polypeptides, the terms "substantial identity" or 
"substantial sequence identity" mean that two peptide sequences, when optimally 
aligned, such as by the programs GAP or BESTFIT using default gap weights, share 
at least 80 percent sequence identity, preferably at least 90 percent sequence 
identity, more preferably at least 95 percent sequence identity or more. 
"Percentage amino acid identity" or "percentage amino acid sequence identity" 
refers to a comparison of the amino acids of two polypeptides which, when 
optimally aligned, have approximately the designated percentage of the same amino 
acids. For example, "95% amino acid identity" refers to a comparison of the amino 
acids of two polypeptides which when optimally aligned have 95% amino acid 
identity. Preferably, residue positions that are not identical differ by conservative
amino acid substitutions. For example, the substitution of amino acids having
similar chemical properties such as charge or polarity is not likely to affect the
properties of a protein. Examples include glutamine for asparagine or glutamic acid
for aspartic acid.

The phrase "substantially purified" or "isolated" when referring to a
DPD polypeptide means a chemical composition that is essentially free of other
cellular components. The DPD polypeptide is preferably in a homogeneous state,
although it can be in either a dry form or in an aqueous solution. Purity and
homogeneity are typically determined using analytical chemistry techniques such as
polyacrylamide gel electrophoresis (PAGE) or high performance liquid
chromatography (HPLC). A protein that is the predominant species present in a
preparation is considered substantially purified. Generally, a substantially purified or
isolated protein will comprise more than 80% of all macromolecular species present
in the preparation. Preferably, the protein is purified to represent greater than 90%
of all macromolecular species present. More preferably the protein is purified to
greater than 95%, and most preferably the protein is purified to essential
homogeneity, wherein other macromolecular species are not detected by
conventional techniques.

The phrase "specifically binds to an antibody" or "specifically
immunoreactive with," when referring to a protein or peptide, refers to a binding
reaction that is determinative of the presence of the protein in the presence of a
heterogeneous population of proteins and other biologics. Thus, under designated
immunoassay conditions, the specified antibodies bind to a particular protein and do
not bind in a significant amount to other proteins present in the sample. Obtaining
an antibody that specifically binds to a particular protein may require screening.
For example, antibodies raised to the human DPD protein immunogen with the
amino acid sequence depicted in SEQ. ID No. 2 can be selected to obtain antibodies
specifically immunoreactive with DPD proteins and not with other proteins. These
antibodies recognize proteins that are homologous to the human DPD protein, such
as DPD proteins from other mammalian species. A variety of immunoassay formats
can be used to select antibodies specifically immunoreactive with a particular
protein. For example, solid-phase enzyme-linked immunoassays (ELISAs) are
routinely used to select monoclonal antibodies specifically immunoreactive with a
protein. See Harlow and Lane (1988) Antibodies, A Laboratory Manual, Cold Spring
Harbor Publications, New York, for a description of immunoassay formats and
conditions that can be used to determine specific immunoreactivity.



Detailed Description of the Preferred Embodiment

The claimed invention provides compositions and methods that are 
useful for detecting deficient or diminished DPD activity in mammals, including 
humans. These methods and compositions are useful for identifying people who 
are at risk of a toxic reaction to the chemotherapy agent 5-fluorouracil. Methods
and compositions for treating mammals who suffer from an insufficient level of DPD
are also provided. Also included in the invention are methods for expressing high 
levels of DPD in prokaryotes, and selectable markers that function in both 
prokaryotes and eukaryotes. 

The claimed methods and compositions are based on the discovery of 
an isolated cDNA that codes for human dihydropyrimidine dehydrogenase (DPD). A 
newly discovered cDNA that codes for pig DPD is also described. The human (SEQ.
ID No. 1) and pig (SEQ. ID No. 3) DPD cDNA sequences are presented in Figures
1A-1B and 2A-2B, respectively. An alignment of the human and pig DPD deduced
amino acid sequences is shown in Figure 3. The nucleic acids of the invention are
useful for determining whether a patient has an abnormal DPD gene, or whether the
DPD gene in a patient is expressed at an insufficient level. Either of these conditions
can result in a DPD deficiency that can cause the patient to be susceptible to 5-FU 
toxicity. By detecting the DPD deficiency before treatment commences, the 
clinician can either adjust the dose of 5-FU downward, or can choose an alternative 
chemotherapy agent. 



A. Description and Isolation of DPD Nucleic Acids
1. Description of DPD Nucleic Acids

The nucleic acids of the invention are typically identical to or show 
substantial sequence identity (determined as described above) to the nucleic acid 
sequences of SEQ ID No. 1 or SEQ ID No. 3. Nucleic acids encoding human DPD 
will typically hybridize to the nucleic acid sequence of SEQ ID Nos. 1 or 3 under 
stringent hybridization conditions as described herein.

Also claimed are isolated nucleic acids that code for a DPD
polypeptide that specifically binds to an antibody generated against a specific
immunogen, such as an immunogen that has the amino acid sequence depicted
by SEQ ID Nos. 2 or 4, or a specific subsequence of these polypeptides. To
identify whether a nucleic acid encodes such a DPD polypeptide, an immunoassay
is typically employed. Typically, the immunoassay will use a polyclonal or
monoclonal antibody that was raised against the protein of SEQ ID Nos. 2 or 4.
The antibody is selected to have low cross-reactivity against other (non-DPD)
polypeptides, and any such cross-reactivity is removed by immunoadsorption prior
to use in the immunoassay.

In order to produce antisera for use in an immunoassay, the DPD
protein of SEQ ID Nos. 2 or 4 is isolated as described herein, for example, by
recombinant expression. An inbred strain of mouse such as Balb/c is immunized
with the DPD protein using a standard adjuvant, such as Freund's adjuvant, and a
standard mouse immunization protocol (see Harlow and Lane, supra). Alternatively,
a synthetic peptide derived from the amino acid sequences disclosed herein and
conjugated to a carrier protein can be used as an immunogen. Polyclonal sera are
collected and titered against the immunogen protein in an immunoassay, for
example, a solid phase immunoassay with the immunogen immobilized on a solid
support. Polyclonal antisera with a titer of 10^4 or greater are selected and tested
for their cross reactivity against non-DPD proteins, using a competitive binding
immunoassay such as the one described in Harlow and Lane, supra, at pages
570-573. Three non-DPD proteins are used in this determination: the IRK protein
[Kubo et al. (1993) Nature 362:127], the G-IRK protein [Kubo et al. (1993) Nature
364:802] and the ROM-K protein [Ho et al. (1993) Nature 362:127]. These
non-DPD proteins can be produced as recombinant proteins and isolated using
standard molecular biology and protein chemistry techniques as described herein.

Immunoassays in the competitive binding format can be used for the 
crossreactivity determinations. For example, the DPD protein of SEQ ID Nos. 2 or 4 
can be immobilized to a solid support. Proteins added to the assay compete with 
the binding of the antisera to the immobilized antigen. The ability of the above 
proteins to compete with the binding of the antisera against the immobilized protein 
is compared to the DPD protein of Seq. ID Nos. 2 or 4. The percent crossreactivity 
for the above proteins is calculated, using standard calculations. Those antisera
with less than 10% crossreactivity with each of the proteins listed above are
selected and pooled. The cross-reacting antibodies are then removed from the
pooled antisera by immunoadsorption with the above-listed proteins.

The immunoadsorbed and pooled antisera are then used in a 
competitive binding immunoassay as described above to determine whether a 
nucleic acid codes for a DPD polypeptide that specifically binds to an antibody 
generated against human or pig DPD polypeptide of SEQ ID No. 2 or 4, 
respectively. The second protein (the protein encoded by the nucleic acid of 
interest) and the immunogen protein (the human or pig DPD protein of SEQ ID Nos. 
2 or 4) are compared for their ability to inhibit binding of the antiserum to 
immobilized human or pig DPD polypeptide. In order to make this comparison, the 
two proteins are each assayed at a wide range of concentrations to determine the 
amount of each protein required to inhibit the binding of the antisera to the 
immobilized protein by 50%. If the amount of the second protein required is less 
than 10 times the amount of the human DPD protein of SEQ ID No. 2 that is 
required, then the second protein is said to specifically bind to an antibody 
generated to an immunogen consisting of the human DPD protein of SEQ ID No. 2. 
Similarly, the second protein is said to specifically bind to an antibody generated 
against an immunogen consisting of the pig DPD protein of SEQ ID No. 4 if the 
amount of second protein required to block antiserum binding by 50% is ten times 
or less than the amount of pig DPD protein required. 
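
The 50% inhibition comparison described above can be sketched in Python as follows. The sketch estimates the amount of protein needed for 50% inhibition by linear interpolation on log10(concentration), which is an assumption made here for illustration; the text does not prescribe a curve-fitting method. The text states the criterion as "less than 10 times" for the human protein and "ten times or less" for the pig protein; the sketch uses the strict form. Function names, concentrations, and inhibition values are hypothetical.

```python
import math


def ic50(concentrations, inhibition):
    """Concentration giving 50% inhibition of antiserum binding.

    Inputs are parallel lists sorted by increasing concentration; inhibition
    is in percent. Uses linear interpolation on log10(concentration).
    """
    points = list(zip(concentrations, inhibition))
    for (c_lo, i_lo), (c_hi, i_hi) in zip(points, points[1:]):
        if i_lo <= 50.0 <= i_hi and i_hi > i_lo:
            frac = (50.0 - i_lo) / (i_hi - i_lo)
            log_c = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10.0 ** log_c
    raise ValueError("50% inhibition is not bracketed by the data")


def specifically_binds(ic50_test: float, ic50_reference: float,
                       max_fold: float = 10.0) -> bool:
    """True if the test protein needs less than `max_fold` times the amount
    of the reference DPD protein to inhibit binding by 50%."""
    return ic50_test < max_fold * ic50_reference


conc = [1.0, 10.0, 100.0, 1000.0]        # hypothetical concentrations, arbitrary units
reference_inhib = [10.0, 50.0, 90.0, 99.0]  # hypothetical: immunogen DPD protein
test_inhib = [5.0, 30.0, 70.0, 95.0]        # hypothetical: protein of interest

ref_ic50 = ic50(conc, reference_inhib)
test_ic50 = ic50(conc, test_inhib)
print(ref_ic50, test_ic50, specifically_binds(test_ic50, ref_ic50))
```

A protein that never reaches 50% inhibition over the tested range raises an error in this sketch, which in practice would be read as failing the criterion.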

2. Isolation of DPD Nucleic Acids 

The DPD nucleic acid compositions of this invention, whether cDNA, 
genomic DNA, RNA, or a hybrid of the various combinations, may be isolated from 
natural sources or may be synthesized in vitro. The nucleic acids claimed can be 
present in transformed or transfected whole cells, in a transformed or transfected 
cell lysate, or in a partially purified or substantially pure form. 

Techniques for manipulating the DPD and other nucleic acids, such as 
those techniques used for subcloning the nucleic acids into expression vectors, 
labelling probes, nucleic acid hybridization, and the like are described generally in 
Sambrook et al., Molecular Cloning - A Laboratory Manual (2nd Ed.), Vol. 1-3, Cold
Spring Harbor Laboratory, Cold Spring Harbor, New York, 1989, which is
incorporated herein by reference. This manual is hereinafter referred to as
"Sambrook."

Various methods for isolating the DPD nucleic acids are available. For
example, one can isolate DNA from a genomic or cDNA library by using labelled
oligonucleotide probes that have nucleotide sequences that are complementary to
the human and pig DPD gene sequences disclosed herein (SEQ. ID Nos. 1 and 3,
respectively). One can use full-length probes or oligonucleotide probes that are
based on specific subsequences of these genes. Probes are discussed more fully
below. One can use such probes directly in hybridization assays to identify nucleic
acids that code for DPD, or one can use amplification methods such as PCR to
isolate DPD nucleic acids.

Methods for making and screening cDNA libraries are well known.
See, e.g., Gubler, U. and Hoffman, B.J. (1983) Gene 25: 263-269 and
Sambrook, supra. Briefly, to prepare a cDNA library for the purpose of isolating a
DPD cDNA, one isolates mRNA from tissue that expresses DPD. Liver is a
particularly useful tissue for this purpose, as are peripheral blood lymphocytes.
Most other cells also likely produce DPD due to its critical role in pyrimidine
degradation and β-alanine synthesis. cDNA is then prepared from the mRNA using
standard techniques and ligated into a recombinant vector. The vector is
transfected into a recombinant host for propagation, screening and cloning.

Methods for preparing genomic libraries are also well known to those 
of skill in the art. See. e.g.. Sambrook. supra. Typically, one can prepare a 
genomic library by extracting DNA from tissue and either mechanically shearing or 
enzymatically digesting the DNA to yield fragments of about 12-20kb, or longer if a 
25 cosmid is used as the cloning vector. Fragments of the desired size are purified by 
density gradient centrifugation or gel electrophoresis. The fragments are then 
cloned into suitable cloning vectors, such as bacteriophage lambda vectors or 
cosmids. If phage or cosmids are used, one then packages the DNA in vitro, as 
described in Sambrook, supra. Recombinant phage or cosmids are analyzed by 
plaque hybridization as described in Benton and Davis (1977) Science 196: 180-182. 
Colony hybridization is carried out as generally described in Grunstein et al. 
(1975) Proc. Natl. Acad. Sci. USA 72: 3961-3965. 

Standard techniques are used to screen the cDNA or genomic DNA 
libraries to identify those vectors that contain a nucleic acid that encodes a human 



WO 96/08568 



PCT/US95/12016 



or mammalian DPD. For example, Southern blots are utilized to identify those 
library members that hybridize to nucleic acid probes derived from the human or pig 
DPD nucleotide sequences shown in Figures 1A-1B and 2A-2B, respectively. See, 
e.g., Sambrook, supra. 

Alternatively, one can prepare DPD nucleic acids by using any of 
various methods of amplifying target sequences, such as the polymerase chain 
reaction. For example, one can use polymerase chain reaction (PCR) to amplify 
DPD nucleic acid sequences directly from mRNA, from cDNA or genomic DNA, or 
from genomic DNA libraries or cDNA libraries. Briefly, to use PCR to isolate the 
DPD nucleic acids from genomic DNA, one synthesizes oligonucleotide primer pairs 
that are complementary to the 5' and 3' sequences that flank the DNA region to be 
amplified. One can select primers to amplify the entire region that codes for a 
full-length DPD polypeptide, or to amplify smaller DNA segments that code for part of 
the DPD polypeptide, as desired. Suitable primer pairs for amplification of the 
human DPYD gene are shown in Table 1 and are listed as SEQ ID Nos. 5 and 6, 7 
and 8, and 9 and 10. Polymerase chain reaction is then carried out using the two 
primers. See, e.g., PCR Protocols: A Guide to Methods and Applications (Innis, M., 
Gelfand, D., Sninsky, J. and White, T., eds.), Academic Press, San Diego (1990). 
Amplified fragments can be used as hybridization probes to identify other DPD 
nucleic acids, such as those from organisms other than human and pig. 
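
As an illustration only, the selection of a primer pair that brackets a region to be amplified can be sketched in Python as follows. The function names, the primer length, and the toy template are assumptions made for this sketch; the template is not taken from SEQ ID Nos. 1 or 3, and the sketch does not reproduce the primers of Table 1. The forward primer is read directly from the 5' end of the sense strand (and is therefore complementary to the antisense strand), and the reverse primer is the reverse complement of the 3' end of the sense strand.

```python
# Minimal sketch (hypothetical helper names): derive a PCR primer pair
# from the two ends of a region on the sense strand of a template.

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def flanking_primers(template: str, start: int, end: int, primer_len: int = 20):
    """Return (forward, reverse) primers for template[start:end].

    forward: first primer_len bases of the region, read 5'->3' on the sense strand.
    reverse: reverse complement of the last primer_len bases of the region.
    """
    region = template.upper()[start:end]
    if len(region) < 2 * primer_len:
        raise ValueError("region is too short for two non-overlapping primers")
    forward = region[:primer_len]
    reverse = reverse_complement(region[-primer_len:])
    return forward, reverse


# Toy template, invented for illustration (not a DPD sequence).
toy_template = "ATGCGTACCGGATTACGCTAGGCTA"
fwd, rev = flanking_primers(toy_template, 0, len(toy_template), primer_len=6)
print(fwd, rev)
```

In practice one would also evaluate melting temperature, self-complementarity, and specificity of each primer, none of which this sketch attempts.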

Other methods known to those of skill in the art can also be used to 
isolate DNA encoding the DPD polypeptides. See, e.g., Sambrook, supra, for a 
description of other techniques that are useful for isolating DNA that codes for 
specific polypeptides. 



B. Diagnostic Methods: Detection of DPD Deficiency by Nucleic Acid Detection 

To permit the clinician to determine whether a patient has diminished 
or deficient DPD activity, and thus an enhanced risk of a toxic reaction to 5-FU, the 
present invention provides methods and reagents for detecting DNA and RNA 
molecules that code for DPD. These methods permit one to detect DPD deficiency 
in a patient whether the deficiency is due to a deleted DPD gene (DPYD), a DPD 
gene that is expressed at a lower than normal rate, or a missense or nonsense 
mutation that results in an abnormal DPD polypeptide. If any of these tests indicates 
that the patient has a DPD deficiency, the clinician should exercise extreme caution 
in using 5-FU as a chemotherapy agent. These methods are also suitable for 
diagnosing other disorders that are caused by DPD nucleic acid deficiency, such as 
thymine uraciluria. 

1. Oligonucleotide Probes 

One aspect of the invention is nucleic acid probes that are useful for 
detecting the presence or absence of DPD nucleic acids in a sample from a human 
or other mammal. Typically, oligonucleotides are used, although longer fragments 
that comprise most or all of a DPD gene are also suitable. The claimed probes are 
specific for human or pig DPD genes. Oligonucleotide probes are generally between 
about 10 and 100 nucleotides in length, and are capable of selectively hybridizing, 
under stringent hybridizing conditions, to a target region, a specific subsequence of 
a DPD nucleic acid. The probes selectively hybridize to DPD nucleic acids, meaning 
that under stringent hybridization conditions the probes do not substantially 
hybridize to non-DPD nucleic acids (less than 50% of the probe molecules hybridize 
to non-DPD nucleic acids). One of skill will recognize that oligonucleotide probes 
complementary to specific subsequences of the target regions, but not to the entire 
target region, will also function in the claimed assays so long as such probes 
selectively hybridize to the target regions. 

Alternatively, the oligonucleotide probe can comprise a concatemer 
that has the formula [X-Y-Z]n, wherein: 

a) X is a sequence of 0 to 100 nucleotides or nucleotide analogs 
that are not complementary to a DPD nucleic acid; 

b) Y is a sequence of 10 to 100 nucleotides or nucleotide analogs 
that are capable of hybridizing under stringent hybridizing conditions to a DPD 
nucleic acid; 

c) Z is a sequence of nucleotides the same as or different from X, 
such that nucleotides or nucleotide analogs are not complementary to a DPD 
nucleic acid; and 

d) n is 1-500, or more, and, where n is greater than 1, Y can be 
the same or different sequences of nucleotides having the indicated hybridization 
capability. The probe can be free or contained within a vector sequence (e.g., 
plasmids or single stranded DNA). 
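
The [X-Y-Z]n arrangement can be pictured with a short Python sketch. The function name and the toy sequences are assumptions introduced here for illustration and are not DPD sequences. The sketch enforces only the length limits stated above for X and Y, and takes n from the number of Y units supplied; it does not test whether X and Z are in fact non-complementary to a DPD nucleic acid, or whether Y hybridizes under stringent conditions, since both must be established experimentally or against the actual sequences.

```python
# Minimal sketch (hypothetical helper): assemble a concatemer [X-Y-Z]n in
# which each repeat may carry a different Y unit.
from typing import Sequence


def build_concatemer(x: str, y_units: Sequence[str], z: str) -> str:
    """Join X + Y_i + Z for every Y_i supplied; n equals len(y_units)."""
    if len(x) > 100:
        raise ValueError("X must be 0 to 100 nucleotides long")
    if len(y_units) < 1:
        raise ValueError("n must be at least 1")
    for y in y_units:
        if not 10 <= len(y) <= 100:
            raise ValueError("each Y must be 10 to 100 nucleotides long")
    return "".join(x + y + z for y in y_units)


# Toy example with n = 2 and two different Y units (invented sequences).
probe = build_concatemer("AA", ["ACGTACGTAC", "TTGCATGCAA"], "")
print(probe)
```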
The degree of complementarity (homology) required for detectable 
binding with the DPD nucleic acids will vary in accordance with the stringency of 
the hybridization medium and/or wash medium. The degree of complementarity will 
optimally be 100 percent; however, it should be understood that minor variations in 
the DPD nucleic acids may be compensated for by reducing the stringency of the 
hybridization and/or wash medium as described below. Thus, despite the lack of 
100 percent complementarity under reduced conditions of stringency, functional 
probes having minor base differences from their DPD nucleic acid targets are 
possible. Therefore, under hybridization conditions of reduced stringency, it may be 
possible to modify up to 60% of a given oligonucleotide probe while maintaining an 
acceptable degree of specificity. In addition, analogs of nucleosides may be 
substituted within the probe for naturally occurring nucleosides. This invention is 
intended to embrace these species when referring to polynucleic acid probes. 

Suitable oligonucleotide probes include synthetic oligonucleotides, 
cloned DNA fragments, PCR products, and RNA molecules. The nature of the 
probe is not important, provided that it hybridizes specifically to DPD nucleic acids, 
and not to other nucleic acids under stringent hybridization conditions. 

To obtain large quantities of DNA or RNA probes, one can either clone 
the desired sequence using traditional cloning methods, such as described in 
Sambrook et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor, 
New York, 1989, or one can produce the probes by chemical synthesis using 
commercially available DNA synthesizers. An example of cloning would involve 
insertion of all or part of the cDNA for the human or pig DPD gene into a replication 
vector, such as pBR322 or M13, or into a vector containing the SP6 promoter (e.g., 
for generation of single-stranded DPD RNA using SP6 RNA polymerase), and 
transformation of a bacterial host. The probes can be purified from the host cell by 
lysis and nucleic acid extraction, treatment with selected restriction enzymes, and 
further isolation by gel electrophoresis. 

Oligonucleotide probes can be chemically synthesized using 
commercially available methods and equipment. For example, the solid phase 
phosphoramidite triester method first described by Beaucage and Carruthers [(1981) 
Tetrahedron Lett. 22: 1859-1862] is suitable. This method can be used to produce 
relatively short probes of between 10 and 50 bases. The triester method described 
by Matteucci et al. [(1981) J. Am. Chem. Soc. 103: 3185] is also suitable for 
synthesizing oligonucleotide probes. Conveniently, one can use an automated 
oligonucleotide synthesizer such as the Model 394 DNA/RNA Synthesizer from 
Applied Biosystems (Foster City, CA) using reagents supplied by the same 
company. 

After synthesis, the oligonucleotides are purified either by native 
acrylamide gel electrophoresis or by anion-exchange HPLC as described in, for 
example, Pearson and Regnier (1983) J. Chrom. 255: 137-149. The sequence of 
the synthetic oligonucleotide can be verified using the chemical degradation method 
of Maxam, A.M. and Gilbert, W. (1980) In Grossman, L. and Moldave, D., eds., 
Academic Press, New York, Methods in Enzymology, 65: 499-560. 

Probes can be comprised of the natural nucleotides or known analogs 
of the natural nucleotides, including those modified to bind labeling moieties. 
Oligonucleotide probes that comprise thionucleotides, and thus are resistant to 
nuclease cleavage, are also suitable. One can use probes that are the full length of 
the DPD coding regions, or probes that hybridize to a specific subsequence of a 
DPD gene. Shorter probes are empirically tested for specificity. Preferably, nucleic 
acid probes are 15 nucleotides or longer in length, although oligonucleotide probe 
lengths of between about 10 and 100 nucleotides or longer are appropriate. 
Sambrook, supra, describes methods for selecting nucleic acid probe sequences for 
use in nucleic acid hybridization. 

For purposes of this invention, the probes are typically labelled so that 
one can detect whether the probe has bound to a DPD nucleic acid. Probes can be 
labeled by any one of several methods typically used to detect the presence of 
hybrid polynucleotides. The most common method of detection is the use of 
autoradiography using probes labeled with ³H, ¹²⁵I, ³⁵S, ¹⁴C, ³²P, or the like. The 
choice of radioactive isotope depends on research preferences due to ease of 
synthesis, stability, and half lives of the selected isotopes. Other labels include 
ligands which bind to antibodies labeled with fluorophores, chemiluminescent 
agents, and enzymes. Alternatively, probes can be conjugated directly with labels 
such as fluorophores, chemiluminescent agents or enzymes. The choice of label 
depends on sensitivity required, ease of conjugation with the probe, stability 
requirements, and available instrumentation. 

The choice of label dictates the manner in which the label is bound to 
the probe. Radioactive probes are typically made using commercially available 
nucleotides containing the desired radioactive isotope. The radioactive nucleotides 
can be incorporated into probes, for example, by using DNA synthesizers, by nick 
translation or primer extension with DNA polymerase I, by tailing radioactive 
nucleotides to the 3' end of probes with terminal deoxynucleotidyl transferase, by 
incubating single-stranded M13 plasmids having specific inserts with the Klenow 
fragment of DNA polymerase in the presence of radioactive deoxynucleotides, 
dNTP, by transcribing from RNA templates using reverse transcriptase in the 
presence of radioactive deoxynucleotides, dNTP, or by transcribing RNA from 
vectors containing specific RNA viral promoters (e.g., SP6 promoter) using the 
corresponding RNA polymerase (e.g., SP6 RNA polymerase) in the presence of 
radioactive ribonucleotides rNTP. 

The probes can be labeled using radioactive nucleotides in which the 
isotope resides as a part of the nucleotide molecule, or in which the radioactive 
component is attached to the nucleotide via a terminal hydroxyl group that has 
been esterified to a radioactive component such as inorganic acids, e.g., ³²P 
phosphate or ¹⁴C organic acids, or esterified to provide a linking group to the label. 
Base analogs having nucleophilic linking groups, such as primary amino groups, can 
also be linked to a label. 

Non-radioactive probes are often labeled by indirect means. For 
example, a ligand molecule is covalently bound to the probe. The ligand then binds 
to an anti-ligand molecule which is either inherently detectable or covalently bound 
to a detectable signal system, such as an enzyme, a fluorophore, or a chemiluminescent 
compound. Ligands and anti-ligands may be varied widely. Where a ligand 
has a natural anti-ligand, namely ligands such as biotin, thyroxine, and cortisol, it 
can be used in conjunction with its labeled, naturally occurring anti-ligands. 
Alternatively, any haptenic or antigenic compound can be used in combination with 
an antibody. 

Probes can also be labeled by direct conjugation with a label. For 
example, cloned DNA probes have been coupled directly to horseradish peroxidase 
or alkaline phosphatase, as described in Renz, M., and Kurz, K. (1984) A Colorimetric 
Method for DNA Hybridization. Nucl. Acids Res. 12: 3435-3444. Synthetic 
oligonucleotides have been coupled directly to alkaline phosphatase (Jablonski, E., 
et al. (1986) Preparation of Oligodeoxynucleotide-Alkaline Phosphatase Conjugates 
and Their Use as Hybridization Probes. Nucl. Acids Res. 14: 6115-6128; and Li P., 
et al. (1987) Enzyme-linked Synthetic Oligonucleotide Probes: Non-Radioactive 
Detection of Enterotoxigenic Escherichia coli in Faecal Specimens. Nucl. Acids Res. 
15: 5275-5287). 

Enzymes of interest as labels will typically be hydrolases, such as 
phosphatases, esterases and glycosidases, or oxidoreductases, particularly 
peroxidases. Fluorescent compounds include fluorescein and its derivatives, 
rhodamine and its derivatives, dansyl, umbelliferone, etc. Chemiluminescers include 
luciferin, and 2,3-dihydrophthalazinediones, e.g., luminol. 

The oligonucleotide or polynucleotide probes of this invention can 
be included in a kit which can be used to rapidly determine the level of DPD DNA or 
mRNA in cells of a human or other mammalian sample. The kit includes all 
components necessary to assay for the presence of the DPD DNA or mRNA. In the 
universal concept, the kit includes a stable preparation of labeled probes specific for 
DPD nucleic acids, hybridization solution in either dry or liquid form for the 
hybridization of target and probe polynucleotides, as well as a solution for washing and 
removing undesirable and nonduplexed polynucleotides, a substrate for detecting 
the labeled duplex, and optionally an instrument for the detection of the label. 

The probe components described herein include combinations of 
probes in dry form, such as lyophilized nucleic acid, or in precipitated form, such as 
alcohol precipitated nucleic acid, or in buffered solutions. The label can be any of 
the labels described above. For example, the probe can be biotinylated using 
conventional means and the presence of a biotinylated probe can be detected by 
adding avidin conjugated to an enzyme, such as horseradish peroxidase, which can 
then be contacted with a substrate which, when reacted with peroxidase, can be 
monitored visually or by instrumentation using a colorimeter or spectrophotometer. 
This labeling method and other enzyme-type labels have the advantage of being 
economical, highly sensitive, and relatively safe compared to radioactive labeling 
methods. The various reagents for the detection of labeled probes and other 
miscellaneous materials for the kit, such as instructions, positive and negative 
controls, and containers for conducting, mixing, and reacting the various components, 
would complete the assay kit. 

2. Assays for Detecting DPD Nucleic Acid Deficiency 

One embodiment of the invention provides assays for determining 
whether a patient is at risk of a toxic reaction to 5-fluorouracil, or suffers from a 
condition that is caused by inadequate levels of DPD (such as thymine uraciluria). 
The assay methods involve determining whether the patient is deficient in DPD 
nucleic acids. A deficiency can arise if the patient is lacking all or part of one or 
both copies of the DPD gene, or if the DPD gene is not expressed in the appropriate 
cells of the patient. Another potential cause of DPD deficiency that is detectable 
using the claimed invention is a nonsense or missense mutation in the DPD gene 
that results in an abnormal DPD polypeptide. 

Assay test protocols for use in this invention are those of convention 
in the field of nucleic acid hybridization, and include both single phase, where the 
target and probe polynucleic acids are both in solution, and mixed phase 
hybridizations, where either the target or probe polynucleotides are fixed to an 
immobile support. The assay test protocols are varied and are not to be considered 
a limitation of this invention. A general review of hybridization can be had from a 
reading of Nucleic Acid Hybridization: A Practical Approach, Hames and Higgins, 
eds., IRL Press, 1985; and Hybridization of Nucleic Acids Immobilized on Solid 
Supports, Meinkoth and Wahl (1984) Analytical Biochemistry 138: 267-284. 
Mixed phase hybridizations are preferred. 

One potential cause of DPD deficiency is a deletion of all or part of 
one or more copies of the DPD gene in a patient's chromosomal DNA. To 
determine whether a patient lacks a gene that codes for DPD, the clinician can 
employ a Southern blot or other means suitable for detecting the presence of a 
specific nucleotide sequence in genomic DNA. A variety of methods for specific 
DNA and RNA measurement using nucleic acid hybridization techniques are known 
to those of skill in the art. See, e.g., Sambrook, supra. Briefly, the procedure for a 
Southern blot is as follows. Genomic DNA is isolated from a sample obtained from 
the patient. One can obtain DNA from almost any cellular tissue of the patient. 
The DNA is digested using one or more restriction enzymes, after which it is size- 
fractionated by electrophoresis through an agarose slab gel. The DNA is then 
immobilized by transfer from the gel to a membrane (commonly nylon or 
nitrocellulose). 

If all or part of the DPD gene is missing from the patient's genomic 
DNA, the probe will not hybridize to the genomic DNA, or else will hybridize to a 
different-sized restriction fragment compared to the wild-type DPD gene. If a 
patient is heterozygous at the DPD locus, the clinician will observe either a reduced 
hybridization signal compared to wild-type (probe region deleted from one of the 
two alleles) or hybridization to two different-sized restriction fragments (part of one 
DPD gene deleted). If a sample from a patient lacks a gene that codes for DPD, the 
clinician should exercise extreme caution in using 5-FU as chemotherapy. A patient 
who is missing all or part of one or both DPD genes (e.g., either a heterozygote or 
homozygote for a defective DPD gene) is at risk of 5-FU toxicity or conditions such 
as thymine uraciluria that are due to inadequate levels of DPD activity. 
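
The band patterns described above can be summarized, purely as an illustration, by the following Python sketch. The function name, the size tolerance, and the cutoff used to call a signal "reduced" are assumptions chosen for this example and are not values taught by this disclosure; in practice such thresholds would be set against wild-type controls run on the same blot.

```python
# Minimal sketch (hypothetical helper and thresholds): map the bands seen on
# a Southern blot to the deletion patterns described in the text.
from typing import List, Tuple


def interpret_southern(
    bands: List[Tuple[float, float]],   # (fragment size in kb, hybridization signal)
    wild_type_size_kb: float,
    wild_type_signal: float,
    size_tol_kb: float = 0.2,           # assumed tolerance for "same size"
    reduced_cutoff: float = 0.75,       # assumed fraction below which signal is "reduced"
) -> str:
    if not bands:
        # No hybridization: probe region absent from the genomic DNA.
        return "no_hybridization"
    wt = [b for b in bands if abs(b[0] - wild_type_size_kb) <= size_tol_kb]
    other = [b for b in bands if abs(b[0] - wild_type_size_kb) > size_tol_kb]
    if wt and other:
        # Two different-sized fragments: part of one DPD gene deleted.
        return "two_fragment_sizes"
    if other:
        # Only a different-sized fragment hybridizes.
        return "altered_fragment_only"
    if sum(signal for _, signal in wt) < reduced_cutoff * wild_type_signal:
        # Wild-type size but weaker signal: probe region deleted from one allele.
        return "reduced_signal"
    return "wild_type_pattern"


print(interpret_southern([(5.0, 50.0), (3.2, 50.0)], 5.0, 100.0))
```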

DPD deficiency that results in 5-FU toxicity or thymine uraciluria might 
also result from insufficient DPD mRNA levels. The Northern blot is a particularly 
useful method for detecting DPD mRNA levels. By detecting DPD mRNA levels, 
rather than detecting the presence of the DPD gene, Northern blots permit 
quantitation of DPD gene expression. This facilitates identification of patients who 
are DPD deficient for any of several reasons. A homozygote in which both DPD 
alleles are deleted will produce no DPD mRNA, while a heterozygote will generally 
have an intermediate level of DPD mRNA compared to a patient who is homozygous 
wild type. A Northern blot also allows the clinician to identify patients who, 
although they carry DPD genes, have a lower than normal level of DPD gene 
expression. Such patients are also at risk of 5-FU toxicity and thymine uraciluria. 

Suitable samples for detection of DPD mRNA include any cells from 
the patient that express the DPD gene. Preferably, the cells will be obtained from a 
tissue that has high levels of DPD activity. In humans, the liver and lymphocytes 
generally have the highest DPD activity, with other tissues having less activity 
[Naguib et al. (1985) Cancer Res. 45: 5405-5412]. Because lymphocytes are much 
easier to isolate from a patient than liver cells, lymphocytes are a preferred sample 
for detecting DPD mRNA according to the claimed invention. However, one can 
also detect DPD mRNA in other cell types, such as fibroblasts. 

Suitable methods for Northern blots are described in, for example, 
Sambrook, supra, and Chomczynski and Sacchi (1987) Anal. Biochem. 162: 156-159. 
Briefly, RNA is isolated from a cell sample using an extraction solution that 
releases the RNA from the cells while preventing degradation of the RNA. A 
commonly-used extraction solution contains a guanidinium salt. The RNA is purified 
from the extraction solution, such as by phenol-chloroform extraction followed by 
ethanol precipitation. Optionally, one can separate the mRNA from ribosomal RNA 
and transfer RNA by oligo-dT cellulose chromatography, although such purification 
is not required to practice the claimed invention. The RNA is then size-fractionated 
by electrophoresis, after which the RNA is transferred from the gel to a 
nitrocellulose or nylon membrane. Labeled probes are used to ascertain the 
presence or absence of DPD-encoding mRNA. 

If a sample from a patient has an insufficient amount of DPD nucleic 
acids, the patient is at risk of a toxic reaction to 5-FU, or is likely to suffer from 
thymine uraciluria or a related condition. Generally, an insufficient amount of DPD 
nucleic acids is less than about 70% of the normal amount of DPD nucleic acid, 
where "normal" refers to the amount of DPD nucleic acid found in the same 
amount of DNA or RNA from a sample that is not known to have a DPD deficiency. 
More typically, an amount of DPD that is less than about 50% of normal is 
indicative of an enhanced risk of 5-FU toxicity or thymine uraciluria. 
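
The comparison against a normal sample can be expressed as a simple ratio, as in the following Python sketch. The function name and the returned labels are assumptions introduced for illustration; the two cutoffs are the approximate 70% and 50% figures given above, and the input values are assumed to be hybridization signals measured from equal amounts of DNA or RNA.

```python
# Minimal sketch (hypothetical helper): compare a patient's DPD nucleic acid
# signal with that of a sample not known to have a DPD deficiency.

def classify_dpd_level(
    sample_signal: float,
    normal_signal: float,
    insufficient_cutoff: float = 0.70,  # "less than about 70% of the normal amount"
    typical_cutoff: float = 0.50,       # "less than about 50% of normal"
):
    """Return (fraction of normal, label) for equal input amounts of DNA or RNA."""
    if normal_signal <= 0:
        raise ValueError("normal_signal must be positive")
    fraction = sample_signal / normal_signal
    if fraction < typical_cutoff:
        label = "below_50_percent"
    elif fraction < insufficient_cutoff:
        label = "below_70_percent"
    else:
        label = "not_insufficient"
    return fraction, label


print(classify_dpd_level(40.0, 100.0)[1])
```

Because both cutoffs are stated only as approximate values, a result close to either boundary would call for repeat measurement rather than reliance on the label alone.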

Yet another potential cause of DPD deficiency in a patient is a 
missense or nonsense mutation in the DPD gene, or a mutation that interferes with 
mRNA processing. Our invention allows the clinician to detect these mutations. By 
choosing a probe that hybridizes to a mutant DPD gene, but not to the wild-type 
DPD gene (or vice versa), one can determine whether the patient carries an 
abnormal DPD gene that may result in inadequate expression of the DPD gene or 
expression of an abnormal DPD enzyme that has less activity than the wild-type 
enzyme. 

A variety of nucleic acid hybridization formats in addition to Northern 
and Southern blots are known to those skilled in the art. For example, common 
formats include sandwich assays and competition or displacement assays. 
Hybridization techniques are generally described in "Nucleic Acid Hybridization: A 
Practical Approach," Hames, B.D. and Higgins, S.J. (eds.), IRL Press, 1985; Gall 
and Pardue (1969) Proc. Natl. Acad. Sci. USA 63: 378-383; and John et al. 
(1969) Nature 223: 582-587. These assays are sometimes preferred over classical 
Northern and Southern blots because of their greater speed and simplicity. 

Sandwich assays are commercially useful hybridization assays for 
detecting or isolating nucleic acid sequences. These assays are easily automated, 
which results in a more cost-effective and sometimes more accurate assay. 
Sandwich assays utilize a "capture" nucleic acid that is covalently linked to a solid 
support, and a labelled "signal" nucleic acid that is in solution. The clinical sample 
provides the target nucleic acid. The "capture" nucleic acid and "signal" nucleic 
acid probe each hybridize to the target nucleic acid to form a "sandwich" 
hybridization complex. To be effective, the signal nucleic acid cannot hybridize to 
the capture nucleic acid. 

One embodiment of this invention embraces a kit that utilizes the 
concept of the sandwich assay. This kit includes a first component for the 
collection of samples from patients, vials for containment, and buffers for the 
dispersement and lysis of the sample. A second component contains media in 
either dry or liquid form for the hybridization of target and probe polynucleotides, as 
well as for the removal of undesirable and nonduplexed forms by washing. A third 
component includes a solid support upon which is fixed or to which is conjugated 
unlabeled nucleic acid probe(s) that is(are) complementary to a DPD nucleic acid. In 
the case of multiple target analysis more than one capture probe, each specific for 
its own DPD nucleic acid target region, will be applied to different discrete regions 
of the dipstick. A fourth component contains labeled probe that is complementary 
to a second and different region of the same DPD nucleic acid strand to which the 
immobilized, unlabeled nucleic acid probe of the third component is hybridized. 

No matter which assay format is employed, labelled signal nucleic 
acids are typically used to detect hybridization. Complementary nucleic acids or 
signal nucleic acids can be labelled by any one of several methods typically used to 
detect the presence of hybridized polynucleotides, as described above. The most 
common method of detection is the use of autoradiography with ³H, ¹²⁵I, ³⁵S, ¹⁴C, or 
³²P-labelled probes or the like. Other labels include ligands which bind to labelled 
antibodies, fluorophores, chemiluminescent agents, enzymes, and antibodies which 
can serve as specific binding pair members for a labelled ligand. 

Detection of a hybridization complex may require the binding of a 
signal generating complex to a duplex of target and probe polynucleotides or nucleic 
acids. Typically, such binding occurs through ligand and anti-ligand interactions as 
between a ligand-conjugated probe and an anti-ligand conjugated with a signal. The 
label can also allow indirect detection of the hybridization complex. For example, 
where the label is a hapten or antigen, the sample can be detected by using 
antibodies. In these systems, a signal is generated by attaching fluorescent or 
enzyme molecules to the antibodies or, in some cases, by attachment to a 
radioactive label. [Tijssen, P., "Practice and Theory of Enzyme Immunoassays," 
Laboratory Techniques in Biochemistry and Molecular Biology, Burdon, R.H., van 
Knippenberg, P.H., Eds., Elsevier (1985), pp. 9-20]. 

The sensitivity of the hybridization assays can be enhanced through 
use of a nucleic acid amplification system that multiplies the target nucleic acid 
being detected. Examples of such systems include the polymerase chain reaction 
(PCR) system and the ligase chain reaction (LCR) system. Other methods recently 
described in the art are the nucleic acid sequence based amplification (NASBA™, 
Cangene, Mississauga, Ontario) and Q Beta Replicase systems. Amplification 
methods permit one to detect the presence or absence of DPD nucleic acids using 
only a very small sample. Furthermore, amplification methods are especially 
amenable to automation. 

One preferred method for detecting DPD deficiency is reverse 
transcriptase PCR (RT-PCR). Briefly, this method involves extracting RNA from the 
sample being analyzed, making a cDNA copy of the mRNA using an oligo-dT primer 
and reverse transcriptase, and finally amplifying part or all of the cDNA by PCR. 
For primers, one can use oligonucleotide primers that are complementary to the 5' 
and 3' sequences that flank the DNA region to be amplified. One can select 
primers to amplify the entire region that codes for a full-length DPD polypeptide, or 
to amplify smaller DNA segments that code for part of the DPD polypeptide, as 
desired. For human DPD analysis, suitable pairs of primers include: SEQ. ID Nos. 5 
and 6, SEQ. ID Nos. 7 and 8, and SEQ. ID Nos. 9 and 10. A detailed example of 
RT-PCR analysis as used for detection of DPD deficiency is presented in Example 4 
below. 

An alternative means for determining the level at which a DPD gene is 
expressed is in situ hybridization. In situ hybridization assays are well known and 
are generally described in Angerer et al. (1987) Methods Enzymol. 152: 649-660. 
In an in situ hybridization assay, cells are fixed to a solid support, typically a glass 
slide. If DNA is to be probed, the cells are denatured with heat or alkali. The cells 
are then contacted with a hybridization solution at a moderate temperature to 
permit annealing of labeled probes specific to DPD-encoding nucleic acids. The 
probes are preferably labelled with radioisotopes or fluorescent labels. 

C. Expression of Recombinant Dihydropyrimidine Dehydrogenase 

The present invention also provides methods for expressing 
recombinant dihydropyrimidine dehydrogenase (DPD). These methods involve 
cloning the claimed isolated DPD cDNA into an appropriate expression vector, 
transforming the expression vector into a host cell, and growing the host cells 
under conditions that lead to expression of the DPD cDNA. Numerous expression 
systems are suitable for expression of cDNA encoding DPD. Because these basic 
techniques are known to those of skill in the art, no attempt is made here to 
describe in detail the various basic methods known for the expression of proteins in 
prokaryotes or eukaryotes. 

In brief summary, the expression of natural or synthetic nucleic acids 
encoding DPD will typically be achieved by operably linking a DPD-encoding cDNA 
to a promoter that functions in the host cell of choice. Either constitutive or 
inducible promoters are suitable. This "expression cassette" is typically 
incorporated in an expression vector. The vectors contain regulatory regions that 
cause the vector to replicate autonomously in the host cell, or else the vector can 
replicate by becoming integrated into the genomic DNA of the host cell. Suitable 
vectors for both prokaryotes and eukaryotes are known to those of skill in the art. 
Typical expression vectors can also contain transcription and translation 
terminators, translation initiation sequences, and enhancers that are useful for 
regulating the amount of DPD expression. To obtain high level expression of a 
cloned gene, such as those polynucleotide sequences encoding DPD, it is desirable 
to construct expression vectors that contain, at minimum, a strong promoter to 
direct transcription, a ribosome binding site for translational initiation, and a 
transcription/translation terminator. Expression vectors often contain control 
elements that permit the vector to replicate in both eukaryotes and prokaryotes, as 
well as selectable markers that function in each. See, e.g., Sambrook, supra, for 
examples of suitable expression vectors. 

1. Expression in Eukaryotes 

A variety of eukaryotic expression systems such as yeast, insect cell 
lines, bird, fish, and mammalian cells, are known to those of skill in the art. 
Eukaryotic systems, including yeast, mammalian, and insect, suitable for expressing 
DPD are discussed briefly below. 

Synthesis of heterologous proteins in yeast is well known. Methods 
in Yeast Genetics, Sherman, F., et al., Cold Spring Harbor Laboratory (1982) is a 
well recognized work describing the various methods available to produce the 
protein in yeast. Suitable vectors for expression in yeast usually have expression 
control sequences, such as promoters, including 3-phosphoglycerate kinase or other 
glycolytic enzymes, and an origin of replication, termination sequences and the like 
as desired. For instance, suitable vectors are described in the literature (Botstein, et 
al., 1979, Gene 8: 17-24; Broach, et al., 1979, Gene 8: 121-133). Several 
commercial manufacturers of molecular biology reagents sell expression vectors 
that are suitable for use in different eukaryotic host cells [See, e.g., product 
catalogs from Stratagene Cloning Systems, La Jolla CA; Clontech Laboratories, 
Palo Alto CA; Promega Corporation, Madison WI]. These vectors are used as 
directed by the manufacturers except for the modifications described below that are 
necessary for expression of DPD. 

Two procedures are commonly used to transform yeast cells. The 
first method involves converting yeast cells into protoplasts using an enzyme such 
as zymolyase, lyticase or glusulase. The protoplasts are then exposed to DNA and 
polyethylene glycol (PEG), after which the PEG-treated protoplasts are then 
regenerated in a 3% agar medium under selective conditions. Details of this 
procedure are given in the papers by Beggs (1978) Nature (London) 275: 104-109 
and Hinnen et al. (1978) Proc. Natl. Acad. Sci. USA 75: 1929-1933. The second
procedure does not involve removal of the cell wall. Instead, the cells are treated
with lithium chloride or acetate and PEG and put on selective plates [Ito et al.
(1983) J. Bact. 153: 163-168].

The DPD polypeptides, once expressed, can be isolated from yeast by 
lysing the cells and applying standard protein isolation techniques to the lysates. 
The monitoring of the purification process can be accomplished by using Western 
blot techniques, or radioimmunoassay or other standard immunoassay techniques. 
Higher eukaryotes are also suitable host cells for expression of
recombinant DPD. Again, previously described methods are suitable, except that 
the modifications described below are necessary for efficient expression of DPD. 
Expression vectors for use in transforming, for example, mammalian, insect, bird, 
and fish cells are known to those of skill in the art. 

Mammalian cells are illustrative of the techniques used for expression 
of DPD in eukaryotic cells. Mammalian cells typically grow in the form of 
monolayers of cells, although mammalian cell suspensions may also be used. A 
number of suitable host cell lines capable of expressing intact proteins have been 
developed in the art, and include the HEK293, BHK21, and CHO cell lines, and 
various human cells such as COS cell lines, HeLa cells, myeloma cell lines, Jurkat 
cells, etc. Expression vectors for these cells can include expression control 
sequences, such as an origin of replication, a promoter (e.g., the CMV promoter, a
HSV tk (thymidine kinase) promoter or pgk (phosphoglycerate kinase) promoter), an
enhancer [Queen et al. (1986) Immunol. Rev. 89:49], and necessary processing
information sites, such as ribosome binding sites, RNA splice sites, polyadenylation
sites (e.g., an SV40 large T Ag poly A addition site), and transcriptional terminator
sequences. Other animal cells useful for production of recombinant DPD are 
available, for instance, from the American Type Culture Collection Catalogue of Cell 
Lines and Hybridomas (7th edition, 1992), as well as from various commercial 
manufacturers of molecular biology reagents. 

Insect cells are another eukaryotic system that is useful for expressing 
recombinant DPD protein. Appropriate vectors for expressing recombinant DPD in 
insect cells are usually derived from the SF9 baculovirus. Suitable insect cell lines
include mosquito larvae, silkworm, armyworm, moth and Drosophila cell lines such
as a Schneider cell line [See, Schneider J. (1987) Embryol. Exp. Morphol. 27:353-365].

Higher eukaryotic host cells, such as mammalian and insect cells, are 
rendered competent for transformation by various means. There are several well- 
known methods of introducing DNA into animal cells. These include: calcium 
phosphate precipitation, fusion of the recipient cells with bacterial protoplasts 
containing the DNA, treatment of the recipient cells with liposomes containing the 
DNA, DEAE dextran, electroporation and micro-injection of the DNA directly into the 
cells. 
The transformed cells are cultured by means well known in the art. See, e.g.,
Biochemical Methods in Cell Culture and Virology, Kuchler, R.J., Dowden,
Hutchinson and Ross, Inc. (1977). The expressed polypeptides are isolated from
cells grown as suspensions or as monolayers. The DPD polypeptides are recovered 
by well known mechanical, chemical or enzymatic means. 

2. Expression in Prokaryotes

A variety of prokaryotic expression systems can be used to express 
recombinant DPD. Examples of suitable host cells include E. coli, Bacillus,
Streptomyces, and the like. For each host cell, one employs an expression
plasmid that contains appropriate signals that direct transcription and translation in
the chosen host organism. Such signals typically include a strong promoter to
direct transcription, a ribosome binding site for translational initiation, and a
transcription/translation terminator. Examples of regulatory regions suitable for this
purpose in E. coli are the promoter and operator region of the E. coli tryptophan
biosynthetic pathway as described by Yanofsky, C. (1984) J. Bacteriol. 158: 1018-1024
and the leftward promoter of phage lambda (Pλ) as described by Herskowitz
and Hagen (1980) Ann. Rev. Genet. 14: 399-445. Several commercial
manufacturers of molecular biology reagents sell prokaryotic expression vectors
that have been optimized for high levels of heterologous gene expression [See, e.g.,
product catalogs from Stratagene Cloning Systems, La Jolla CA; Clontech
Laboratories, Palo Alto CA; Promega Corporation, Madison WI]. These vectors are
especially suitable for producing recombinant DPD, and are used as directed by the 
manufacturer, except that modifications to the growth medium are required for DPD 
expression, as described below. 

Suitable expression vectors for use in prokaryotes typically contain a 
selectable marker that, when cells are grown under appropriate conditions, causes
only those cells that contain the expression vector to grow. Examples of such
markers useful in E. coli include genes specifying resistance to ampicillin,
tetracycline, or chloramphenicol. See, e.g., Sambrook, supra, for details concerning
selectable markers suitable for use in E. coli.

Overexpression of DPD causes elimination of pyrimidines from cells. 
This results in selection against cells that produce high levels of DPD. The present 
invention provides methods to circumvent this problem. These methods involve 
adding uracil to the growth medium. Addition of other cofactors such as FAD and
FMN also has a beneficial effect, although not as great as for uracil addition. For
expression of DPD in E. coli, for example, a preferred medium is Terrific Broth
[Tartof and Hobbs (1987) Bethesda Research Labs FOCUS 9: 12] that contains 100
µg/ml ampicillin or other antibiotic suitable for the selectable marker contained on
the expression vector employed. To allow growth of cells that express DPD, the
medium is typically supplemented with 100 µM uracil, and optionally 100 µM each
of FAD and FMN, and 10 µM each of Fe(NH4)2(SO4)2 and Na2S.
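
By way of illustration only, a micromolar supplement concentration such as the uracil level given above can be converted to a weighed amount with a short calculation. The following Python sketch is not part of the described method. The function name `supplement_mass_mg` and the 250 ml culture volume are assumptions chosen for the example; the molecular weight of uracil (112.09 g/mol) is the standard value for C4H4N2O2.

```python
def supplement_mass_mg(conc_uM: float, volume_ml: float, mol_weight: float) -> float:
    """Mass (mg) of a supplement needed to reach conc_uM in volume_ml of medium."""
    micromoles = conc_uM * volume_ml / 1000.0   # uM x litres = umol
    micrograms = micromoles * mol_weight        # umol x g/mol = ug
    return micrograms / 1000.0                  # ug -> mg


URACIL_MW = 112.09  # g/mol, standard value for uracil

# Assumed example: 100 uM uracil in a 250 ml culture (volume is illustrative).
uracil_mg = supplement_mass_mg(100.0, 250.0, URACIL_MW)
print(f"uracil to weigh out: {uracil_mg:.2f} mg")
```

The same function can be applied to the other supplements once the molecular weight of the particular salt or hydrate on hand is known.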

Recombinant DPD produced by prokaryotic cells may not necessarily
fold into the same configuration as eukaryotically-produced DPD. If improper
folding inhibits DPD activity, one can "refold" the DPD polypeptide by first
denaturing the protein, and then allowing the protein to renature. This can be
accomplished by solubilizing the bacterially produced proteins in a chaotropic agent
such as guanidine HCl and reducing all the cysteine residues by using a reducing agent
such as β-mercaptoethanol. The protein is then renatured, either by slow dialysis or
by gel filtration. See, e.g., U.S. Patent No. 4,511,503.

Detection of the expressed antigen is achieved by methods known in
the art, such as radioimmunoassay, Western blotting techniques, or
immunoprecipitation. Purification from E. coli can be achieved following procedures
described in, for example, U.S. Patent No. 4,511,503.

3. Purification of DPD Polypeptides 

The DPD polypeptides produced by recombinant DNA technology as
described herein can be purified by standard techniques well known to those of skill
in the art. Typically, the cells are lysed (e.g., by sonication) and the protein is then
purified to substantial purity using standard techniques such as selective
precipitation with such substances as ammonium sulfate, column chromatography,
immunopurification methods, and others. See, e.g., R. Scopes, Protein Purification:
Principles and Practice, Springer-Verlag: New York (1982), which is incorporated
herein by reference. For example, one can raise antibodies against the DPD
polypeptides and use the antibodies for immunoprecipitation or affinity
chromatography using standard methods.
If the DPD polypeptide is produced as a fusion protein, in which the
DPD moiety is fused to non-DPD amino acids, the desired polypeptide can be 
released by digestion with an appropriate proteolytic enzyme. 

D. Use of DPD nucleic acids as selectable markers

Another aspect of the claimed invention is the use of a DPD nucleic 
acid as a selectable marker that is effective in both prokaryotes and eukaryotes. 
Selectable markers are genes that, when present in a cloning vector, produce a 
gene product that enables cells containing the vector to grow under conditions that 
prevent cells lacking the vector from growing. In contrast to the selectable markers 
of the invention, most selectable markers function only in one or the other of 
eukaryotes and prokaryotes, not in both. Thus, cloning vectors that are intended
for propagation in both types of organisms usually require two different selectable
markers.

The claimed selectable markers are DPD-encoding nucleic acids. Cells 
that express these nucleic acids are resistant to 5-FU. 5-fluorouracil, which is toxic 
to both prokaryotic and eukaryotic cells, is degradatively inactivated by DPD. 
Therefore, one can select cells that contain a DPD nucleic acid that is operably 
linked to a promoter simply by growing the cells in the presence of 5-FU. To 
practice the invention, one operably links the DPD nucleic acid to a promoter that 
functions in the host cell of interest. Suitable promoters and other control signals 
are described above. In a preferred embodiment, the DPD nucleic acid is integrated 
into an expression cassette that functions in both prokaryotes and eukaryotes. One 
example of such a bifunctional expression cassette is the ZAP Express™ expression
cassette (Stratagene, La Jolla CA), which is described in U.S. Patent No. 
5,128,256. The DPD nucleic acid is inserted into the multiple cloning site which is 
downstream of a tandem array that includes both prokaryotic and eukaryotic 
transcription and translation regulatory sequences. 

To determine appropriate growth conditions for using the DPD 
selectable marker, one first tests the untransformed host cells of interest for ability 
to grow in medium containing various amounts of 5-FU. A 5-FU concentration that 
results in complete or nearly complete inhibition of host cell growth is then 
employed in the medium used to select transformants. The amount of 5-FU 
required may vary depending on the particular medium used, the host cells, and
whether the cells are grown in liquid culture or on a solid medium such as agar. 
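
The titration described above can be sketched as a simple selection rule: test untransformed host cells across a 5-FU dilution series and take the lowest concentration that gives complete or nearly complete growth inhibition. The Python sketch below is illustrative only. The function name `pick_selective_dose`, the 5% residual-growth threshold, and every number in `example_curve` are hypothetical and are not data from this disclosure.

```python
from typing import Dict, Optional


def pick_selective_dose(growth_by_dose: Dict[float, float],
                        max_relative_growth: float = 0.05) -> Optional[float]:
    """Return the lowest 5-FU concentration at which untransformed host growth,
    relative to the no-drug control (dose 0.0), is at or below the threshold."""
    control = growth_by_dose[0.0]
    for dose in sorted(growth_by_dose):
        if dose == 0.0:
            continue
        if growth_by_dose[dose] / control <= max_relative_growth:
            return dose
    return None  # no tested dose inhibited growth sufficiently


# Hypothetical growth readings (e.g., optical density) keyed by 5-FU dose
# in arbitrary concentration units; 0.0 is the no-drug control.
example_curve = {0.0: 1.20, 1.0: 1.05, 5.0: 0.60, 25.0: 0.04, 100.0: 0.01}
selective_dose = pick_selective_dose(example_curve)
print(selective_dose)
```

Because the required amount depends on the medium, the host, and liquid versus solid culture, the titration would be repeated for each combination actually used.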






EXAMPLES 

Example 1: Cloning and Characterization of Pig and Human DPD cDNAs

In this Example, we describe the cloning and characterization of 
cDNAs for pig and human dihydropyrimidine dehydrogenases. 



MATERIALS AND METHODS 

We isolated total RNA from frozen pig liver using the method of
Chirgwin et al. (1979) Biochemistry 18: 5294-5299, except that we used CsTFA
(Pharmacia, Inc., Milwaukee, WI) instead of CsCl. We extracted the RNA twice
with phenol-chloroform emulsion and then ethanol precipitated the RNA prior to
use. Next, we isolated poly(A) RNA by oligo (dT)-cellulose chromatography [Aviv
and Leder (1977) Proc. Nat'l. Acad. Sci. USA 69: 1408-1412] and used it as a
template for synthesis of cDNA. We used oligo-dT as a primer, and extended the
primer using reverse transcriptase. Then, we made the cDNA double-stranded and
cloned it into λgt24A using a kit supplied by Gibco BRL Life Technologies, Inc.,
Gaithersburg, MD. The DNA was packaged using the λ packaging system from
Gibco BRL. We plated the phage particles in Escherichia coli Y1090r.

To identify plaques that express pig DPD, we screened the library
using a polyclonal antibody against pig DPD [Podschun et al. (1989) Eur. J.
Biochem. 185: 219-224]. We obtained a partial cDNA that we used to rescreen the
library in E. coli Y1088 by plaque hybridization. This yielded a cDNA that contained
the complete DPD reading frame. We subcloned the cDNA into the NotI and SalI
sites of the plasmid vector pSport (Gibco BRL).

To clone the human DPD cDNA, we used a fragment of the pig cDNA
that includes most of the coding region to screen previously amplified human liver
cDNA libraries that had been prepared in λgt11 [Yamano et al. (1989) Biochemistry
28: 7340-7348]. We isolated the human DPD cDNA as three overlapping
fragments, which we subcloned into the EcoRI site of pUC18. The three
fragments were joined together using overlapping ClaI sites in pUC18. We then
determined the complete sequences of pig and human DPD cDNAs using an Applied
Biosystems 373A DNA sequencer, synthetic primers, and fluorescent dye
terminator chemistry as described by the manufacturer. The oligonucleotide
primers were synthesized using a CENTRICON 10™ filter (Millipore Corp.). Each
base was determined at least once on both strands. The DNA and deduced amino 
acid sequences were analyzed using MacVector sequence analysis software 
(International Biotechnologies, Inc., New Haven, CT). 

RESULTS 

We isolated partial pig cDNAs by screening 1 x 10^6 plaques from an
unamplified λgt22A library. After verification by sequencing, we used a partial
cDNA to rescreen 500,000 plaques. Four cDNAs were isolated which contained
inserts of about 4.5 kb. We completely sequenced one of these and found that it
encompassed the full coding region of the protein (Figures 2A-2B). The deduced
amino acid sequence of the amino terminal region agrees with the amino acid
sequence determined from the pig enzyme [Podschun et al. (1989) Eur. J. Biochem.
185: 219-224]. A number of segments of amino acids previously sequenced were
found in the cDNA-deduced amino acid sequence (Figure 3, underlined). These
were determined by cyanogen bromide cleavage (residues 117-127) and trypsin
cleavage (residues 260-277; 308-315; 656-682; 904-913) followed by HPLC
separation and sequencing (data not shown). The first residue of the amino
terminal portion of the 12,000 dalton cleavage fragment from the pig DPD is shown
by a vertical arrow at residue 904. These data establish the pig DPD open reading
frame of 1025 amino acids.

The nucleotide sequence of the human DPD is shown in Figures 1A-1B.
The deduced amino acid sequence of the human DPD is identical to that of the
pig DPD, except where indicated in Figure 3. The calculated molecular weights are
111,416 and 111,398 daltons for pig and human DPD, respectively. The poly(A)
addition sequence of AAATAAA is found 17 bp upstream of a putative poly(A) tract
cloned in the cDNA. This 3'-untranslated region was not isolated in the human
cDNA clones.

The cDNA-derived protein sequences revealed the presence of a
number of putative binding sites for known DPD cofactors. Recent EPR
measurements on DPD from Alcaligenes eutrophus confirmed the existence of FMN,
iron, and acid-labile sulfide, the latter two of which are indicative of iron sulfur
clusters [Schmitt et al. (1994) J. Inorg. Biochem. (in press)]. The C-terminal 12 kDa
peptide fragment purified from the pig DPD shows absorbance in the 500-600 nm
region and contains eight iron and eight acid-labile sulfides [Podschun et al. (1989),
supra.]. The binding sites of iron-sulfur clusters contain Cys residues, a large
number of which are found in the N-terminal half of the protein. However, these do
not exhibit the typical motif pattern seen in other well-characterized iron sulfur-
containing proteins. In the C-terminal region of pig and human DPD are typical
motifs CXXCXXCXXXCX (SEQ ID No. 11) and CXXCXXCXXXCP (SEQ ID No. 12)
for [4Fe-4S] clusters [Dupuis et al. (1991) Biochemistry 30: 2954-2960] between
residues 953 and 964 and residues 986 and 997, respectively. These lie within the
12 kDa iron-sulfur cluster-containing peptide [Podschun et al. (1989), supra.]. No
other [4Fe-4S] clusters were detected; however, other types of iron sulfur clusters
such as [2Fe-2S] might be possible.

A typical NADPH binding motif VXVXGXGXXGXXXAXXA (SEQ ID
No. 13) [Wierenga et al. (1985) Biochemistry 24: 1346-1357] begins with V-335,
except that the Gly at position 10 is an Ala in pig and human DPD. A motif for FAD
binding, TXXXXVFAXGD [Eggink et al. (1990) J. Mol. Biol. 212: 135-142], is in the
N-terminal region starting with T-471 and ending with D-481.
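
Motif searches of the kind described in the two preceding paragraphs can be sketched as a sliding-window scan in which X matches any residue. The Python sketch below is illustrative only and is not the analysis software used in this Example. The function name `scan_motif` and both toy peptides are hypothetical; they are not DPD sequence. The `max_mismatches` parameter is included because, as noted above, the DPD NADPH site departs from the consensus at position 10 and would be missed by a strict scan.

```python
from typing import List, Tuple

# Consensus motifs quoted in the text; X stands for any amino acid.
NADPH_MOTIF = "VXVXGXGXXGXXXAXXA"
FAD_MOTIF = "TXXXXVFAXGD"
FE_S_MOTIF = "CXXCXXCXXXCP"


def scan_motif(sequence: str, motif: str, max_mismatches: int = 0) -> List[Tuple[int, int]]:
    """Return 1-based (start, end) positions of windows matching the motif,
    allowing up to max_mismatches at the fixed (non-X) positions."""
    hits = []
    width = len(motif)
    for i in range(len(sequence) - width + 1):
        window = sequence[i:i + width]
        mismatches = sum(1 for m, aa in zip(motif, window) if m != "X" and m != aa)
        if mismatches <= max_mismatches:
            hits.append((i + 1, i + width))
    return hits


# Hypothetical toy peptides, for demonstration only.
toy_fe_s = "MACAACGGCTTTCPKK"        # contains one CXXCXXCXXXCP match
toy_nadph = "VAVIGAGPSALQCARLA"      # Ala in place of Gly at motif position 10

print(scan_motif(toy_fe_s, FE_S_MOTIF))
print(scan_motif(toy_nadph, NADPH_MOTIF))                     # strict scan
print(scan_motif(toy_nadph, NADPH_MOTIF, max_mismatches=1))   # one mismatch allowed
```

Reporting 1-based start and end positions keeps the output directly comparable with residue numbering of the form used in the text (e.g., T-471 to D-481).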

We elucidated the putative uracil binding site of DPD by incubating
DPD in the presence of 5-iodouracil, a suicide inactivator of the bovine enzyme, and
sequencing the modified chymotryptic peptide [Porter et al. (1991) J. Biol. Chem.
266: 19988-19994]. The corresponding sequence obtained is located between G-661
and R-678 in the primary protein sequence. Thus, the order of the functional
domains of DPD is, from the N-terminus, NADPH/NADP-FAD-uracil-[4Fe-4S].

Example 2: Chromosome localization of the DPD gene

We localized the DPD gene to a specific human chromosome using a
somatic cell hybrid strategy. Human-mouse and human-hamster cell lines were
generated and characterized as described by McBride et al. [(1982a) Nucl. Acids
Res. 10: 8155-8170; (1982b) J. Exp. Med. 155: 1480-1490; (1982c) Proc. Nat'l.
Acad. Sci. USA 83: 130-134]. The human chromosome content of each cell line was
determined by standard isoenzyme analyses as well as by Southern analysis with
probes from previously localized genes, and frequently, by cytogenetic analysis.
Southern blots of hybrid cell DNA restriction digests on positively charged nylon
membranes were prepared after (0.7%) agarose gel electrophoresis and hybridized
at high stringency with 32P-labeled probes under conditions allowing no more than
10% divergence of hybridizing sequences.

We localized the DPD gene to human chromosome 1 by Southern
analysis of a panel of human/rodent somatic cell hybrid DNAs digested with EcoRI
using a 3' coding cDNA fragment as probe (Table 1). The gene segregated
discordantly (> 14%) with all other human chromosomes. The 3' probe identified
a series of bands in human DNAs ranging in size from 0.8 to 1.5 kb. All hybridizing
human bands appeared to cosegregate, indicating that these bands were all present
on the same chromosome. We then sub-localized the gene on chromosome 1 by
analysis of hybrids containing spontaneous breaks and translocations involving this
chromosome. One human/hamster hybrid with a break between NRAS (1p12) and
PGM1 (1p22) retained the telomeric portion of the chromosome 1 short arm but the
DPD gene was absent from this hybrid. Another human/hamster hybrid and a
human/mouse hybrid each retained all, or nearly all, of the short arm of
chromosome 1 including NRAS and all other short arm markers but all long arm
markers were absent including a cluster of genes at 1q21 (trichohyalin, loricrin, and
filaggrin); the human DPD gene was present in both of these hybrids. Finally, one
additional human/hamster hybrid retained a centromeric fragment of chromosome 1
with the breakpoints on the long arm and short arm proximal to 1q21 and proximal
to 1p31, respectively, and human DPD was present in this hybrid. These results
indicate that the DPD gene can be sublocalized to the region 1p22-q21.
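
The discordance figure quoted above is the fraction of hybrids in which the gene and a given chromosome are not both present or both absent. The Python sketch below shows one way such a score might be tallied. It is illustrative only: the function name `discordance_percent` and the eight-hybrid panel are hypothetical and do not reproduce the data of this Example.

```python
from typing import Dict, List


def discordance_percent(gene_present: List[bool], chrom_present: List[bool]) -> float:
    """Percent of hybrids in which gene presence and chromosome presence disagree."""
    if len(gene_present) != len(chrom_present):
        raise ValueError("gene and chromosome calls must cover the same hybrids")
    discordant = sum(g != c for g, c in zip(gene_present, chrom_present))
    return 100.0 * discordant / len(gene_present)


# Hypothetical panel of eight hybrids: True = human band/chromosome detected.
dpd_calls = [True, True, False, False, True, False, True, False]
panel: Dict[str, List[bool]] = {
    "chr1": [True, True, False, False, True, False, True, False],
    "chr2": [True, False, False, True, True, False, False, False],
}

scores = {name: discordance_percent(dpd_calls, calls) for name, calls in panel.items()}
print(scores)
```

A chromosome carrying the gene is expected to score at or near 0%, while unrelated chromosomes score well above it.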

We confirmed these results by Southern analysis of the same panel of
hybrids with a DPD 5' cDNA probe which detected 1.5, 5.0, 8.7, and 11.6 kb
bands in human EcoRI digests. Both probes were used to examine DNAs from ten
unrelated individuals separately digested with 12 different restriction enzymes for
RFLPs. However, no polymorphisms were detected. A large number of hybridizing
bands were detected with both DPD probes and these bands cosegregated,
indicating that they are all localized to the centromeric region of human
chromosome 1 (i.e., 1p22-q21). A number of cross-hybridizing hamster and mouse
bands were also identified with these probes. These results are consistent with the
interpretation that there may be a single reasonably large gene (spanning at least 80
kb) in each of these species, and all hybridizing bands arise from a single gene.
However, we currently cannot exclude the possibility that the many hybridizing
bands arise from a cluster of tandemly linked genes.

Recently, the human DPD gene (named "DPYD" by the human gene 
nomenclature committee) was more precisely mapped to 1p22 [Takai et al. (1994)
(submitted for publication)]. 



Example 3: Expression of Pig DPD in E. coli

In this Example, we demonstrate the heterologous expression of a 
DPD polypeptide in a prokaryotic organism. Because large amounts of DPD protein 
are toxic to the host cells under normal growth conditions, additional components
such as uracil are required in the medium. 



METHODS 

Construction of the Expression Plasmid. We constructed an
expression plasmid by subcloning the pig DPD cDNA into the vector pSE420
(Invitrogen Corp., San Diego, CA). The cDNA contains an Nco I site coincident
with the start codon (CCATGG) which was joined to the Nco I site in the vector
that is in frame with the bacterial initiator Met. The pig DPD cDNA was inserted
into pSE420 as an NcoUAfhW fragment from the pSPORT vector in which the pig 

DPD cDNA had previously been subcloned.

DPD Expression in Escherichia coli. For each expression experiment, a
single colony from a freshly made transformation of DH-5α cells with the
expression vector was inoculated in LB broth and grown to stationary phase. An
aliquot from this culture was used to inoculate 250 ml of Terrific Broth containing
100 µg/ml ampicillin and supplemented with 100 µM each of FAD and FMN, 100
µM uracil and 10 µM each of Fe(NH4)2(SO4)2 and Na2S. Following a 90 min
incubation at 29 °C, we induced the trp-lac promoter in the expression vector by
the addition of 1 mM isopropyl-β-D-thiogalactopyranoside (IPTG) and the culture
was incubated for an additional 48 h.

The cells were then sedimented, washed twice with 250 ml of
phosphate buffered saline (PBS) and resuspended in 45 ml of 35 mM potassium
phosphate buffer (pH 7.3) containing 20% glycerol, 10 mM EDTA, 1 mM DTT, 0.1
mM PMSF and 2 µM leupeptin. The cell suspension was lysed at 4°C with four 30
sec bursts of a Heat Systems sonicator model W 225-R at 25% of full power (Heat
Systems-Ultrasonics, Inc., Plainview NY). The resultant lysate was centrifuged at
100,000 x g for 60 min at 4°C. We then slowly added solid (NH4)2SO4 to the
supernatant at 4°C with gentle stirring to give a final concentration of 30%
saturation. The precipitate was sedimented and the pellet containing expressed
DPD was resuspended in 5 ml of 35 mM potassium phosphate buffer (pH 7.3)
containing 1 mM EDTA, 1 mM DTT and 0.1 mM PMSF. The protein solution was
dialyzed at 4°C for 36 h against 3 changes of 4 liters each of buffer and stored at
-70°C until further use.



Catalytic assay. DPD activity was determined at 37 °C by measuring
the decrease in absorbance at 340 nm associated with the oxidation of NADPH to
NADP+. The reaction mixture contained 28 mM potassium phosphate buffer (pH
7.3), 2 mM MgCl2, 1 mM DTT, 60 µM NADPH and the expressed DPD in a final
volume of 1 ml. The measurements were carried out using an Aminco DW-2000
double beam spectrophotometer using a blank that contained the complete reaction
mixture except substrate. The reactions were initiated by addition of substrate
(uracil, 5-fluorouracil or thymine). The catalytic activity was calculated as µmole of
NADPH oxidized per minute and per mg of expressed DPD. Protein quantities were
determined using the bicinchoninic acid (BCA) procedure from Pierce Chemical Co.,
Rockford, IL following the manufacturer's directions.
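
The conversion from an absorbance change at 340 nm to a specific activity can be sketched as follows. This Python sketch is illustrative only. It assumes the commonly used molar extinction coefficient for NADPH at 340 nm (6220 M^-1 cm^-1) and a 1 cm light path, neither of which is stated in this Example; the function name `specific_activity` and the input values are hypothetical.

```python
NADPH_EXT_COEFF = 6220.0  # M^-1 cm^-1 at 340 nm (assumed standard value)


def specific_activity(delta_a340_per_min: float,
                      protein_mg: float,
                      volume_ml: float = 1.0,
                      path_cm: float = 1.0) -> float:
    """Specific activity in umol NADPH oxidized per minute per mg protein.

    delta_a340_per_min is the magnitude of the absorbance decrease per minute.
    """
    molar_per_min = delta_a340_per_min / (NADPH_EXT_COEFF * path_cm)  # mol/L/min
    umol_per_min = molar_per_min * (volume_ml / 1000.0) * 1e6         # umol/min in cuvette
    return umol_per_min / protein_mg


# Hypothetical reading: 0.0622 absorbance units per minute with 0.01 mg protein
# in the 1 ml reaction volume described above.
activity = specific_activity(0.0622, 0.01)
print(f"{activity:.3f} umol/min/mg")
```

With 60 µM NADPH in a 1 cm cuvette the starting absorbance under this assumption is about 0.37, so the initial linear portion of the trace is the part to use.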

Analysis of cDNA-Expressed DPD Protein. SDS-polyacrylamide gel
electrophoresis was carried out following the method of Laemmli [(1970) Nature
227: 680-685] using 8% acrylamide slab gels. The SDS-PAGE gels were transferred
to a nitrocellulose membrane by electroblotting for 90 min at 1.5 mA/cm2 [Towbin
et al. (1979) Proc. Nat'l. Acad. Sci. USA 76: 4350-4354]. The membranes were
blocked at room temperature using phosphate buffered saline (PBS) containing
0.5% Tween 20 and 3% skim milk. After blocking, the membranes were incubated
for 4 h at room temperature with rabbit anti-pig DPD polyclonal antibody diluted
200-fold in PBS. The membranes were washed three times in PBS containing 0.5%
Tween 20 and rinsed twice with PBS prior to addition of alkaline phosphatase-
labeled goat anti-rabbit IgG. Incubation was continued for 90 min and the
membranes were developed using the reagent BCIP/NBT (Kirkegaard & Perry Labs,
Gaithersburg, MD).
RESULTS

The pig DPD was expressed in bacteria using the vector pSE420
which has a trp-lac promoter that is inducible by isopropyl-β-D-thiogalactopyranoside
(IPTG). Optimal expression was obtained when cells were grown at a
temperature between 26 °C and 30 °C. Growth at higher temperatures resulted in 
aggregation of the protein in inclusion bodies. A number of cofactors known to be 
associated with the enzyme were added to the medium; the most critical was uracil 
which resulted in a greater than five-fold increase in DPD expression levels, 
compared to cells grown in unsupplemented medium. 

The recombinantly expressed DPD enzyme comigrated with the intact
102 kDa DPD purified from pig liver and reacted with rabbit polyclonal antibody
[Podschun et al. (1989) supra.] directed against the pig enzyme. DPD protein was
undetectable in cells containing the expression vector without the DPD cDNA
insert. The DPD purified from pig liver frequently has a second higher mobility band
of about 12 kDa that results from a protease-labile site that liberates the iron sulfur-
containing C-terminal fragment [Podschun et al. (1989) supra.].

The bacterially-expressed enzyme is produced intact and could be 
significantly purified away from other E. coli proteins by a single ammonium sulfate 
fractionation. By use of the purified pig DPD as a standard, we estimate that 50 to
100 mg of DPD were produced per liter of E. coli culture.

We tested the recombinantly expressed DPD enzyme for ability to 
metabolize typical DPD substrates such as uracil, thymine and 5-fluorouracil. 
Kinetic studies revealed that the recombinant DPD follows the ping pong reaction
mechanism as previously shown for purified pig DPD [Podschun et al. (1989),
supra.]. The Km's of the recombinant DPD are of similar magnitude to the values
published for the purified pig [Podschun et al. (1989), supra.], human [Lu et al.
(1992) J. Biol. Chem. 267: 17102-17109] and rat DPD enzymes [Fujimoto et al.
(1991) J. Nutr. Sci. Vitaminol. 37: 89-98]. The Vmax values of expressed DPD
were about three to five-fold lower than the purified pig enzyme, reflecting the fact
that the expressed DPD was only partially purified. However, these data establish
that the expressed enzyme reflects the properties of the purified pig liver DPD.
Thus, E. coli should prove useful for examining any enzymatic variants obtained
through screening DPD-deficient individuals and for preparing large amounts of
intact holoenzyme for physico-chemical analysis.
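
For reference, the textbook rate law for a two-substrate ping pong mechanism, v = Vmax[A][B] / (Ka[B] + Kb[A] + [A][B]), can be written as a short function. The Python sketch below is illustrative only. The function name `ping_pong_rate` and all numerical values are hypothetical and are not kinetic constants measured in this Example.

```python
def ping_pong_rate(a: float, b: float, vmax: float, km_a: float, km_b: float) -> float:
    """Initial velocity for a ping pong bi-bi mechanism.

    a, b       : concentrations of the two substrates (same units as km_a, km_b)
    vmax       : maximal velocity
    km_a, km_b : Michaelis constants for substrates A and B
    """
    return vmax * a * b / (km_a * b + km_b * a + a * b)


# Hypothetical values: both substrates at their Km give one third of Vmax.
v_at_km = ping_pong_rate(a=10.0, b=10.0, vmax=1.0, km_a=10.0, km_b=10.0)

# At saturating levels of both substrates the rate approaches Vmax.
v_saturated = ping_pong_rate(a=1e6, b=1e6, vmax=2.0, km_a=1.0, km_b=1.0)
print(v_at_km, v_saturated)
```

In double-reciprocal plots this rate law gives parallel lines at different fixed levels of the second substrate, which is the usual diagnostic for the mechanism.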




Example 4: Identification of mutations within DPYD gene

In an effort to understand the genetic basis for DPD deficiency, we 
analyzed a Dutch family that included a DPD-deficient individual. We determined 
the phenotype for thymine metabolism and related it to the DPD protein content in 
fibroblasts. Then we identified the genetic defect using RT-PCR and found that the
deficiency was due to a homozygous deletion in the DPD mRNA. The deleted 
portion corresponded to an exon in the DPYD gene. This phenotype/genotype 
relationship accounts for the DPD metabolic disorder in the patient. Additionally, 
we confirmed an autosomal recessive pattern of inheritance for DPD deficiency. 


METHODS 

Isolation of RNA. RNA was isolated from cultures of human fibroblasts
corresponding to all five subjects used in this study by the guanidinium thiocyanate
phenol-chloroform method [Chomczynski and Sacchi (1987) Anal. Biochem. 162:
156-159]. The RNA was dissolved in water and stored at -80°C until further use.

RT-PCR. cDNA was synthesized by reverse transcription from total
RNA isolated from cultured fibroblasts. About 1 µg of total RNA was mixed with
oligo-dT primers and incubated at 65 °C for 15 min to denature secondary structure
in the template. The primed RNA was incubated for 60 min at 40°C in 20 µl of a
reaction mixture containing 100 mM Tris-HCl (pH 8.3), 40 mM KCl, 10 mM MgCl2,
50 µM spermidine, 100 mM dNTPs, 4 mM sodium phosphate, 0.5 units placental
RNase inhibitor and 0.5 units of AMV reverse transcriptase (Invitrogen, CA). The
synthesis reaction was repeated once by the addition of 0.5 units of fresh reverse
transcriptase. The cDNA was made double stranded by PCR without further
purification. The coding region of the cDNA was amplified in three fragments with
the primer pairs indicated in Table 1.
Table 1: Primer pairs for RT-PCR analysis of human DPD cDNA
(hDPD).

Fragment    Primer: location in hDPD                                      SEQ ID
amplified   cDNA (nucleotides)         Primer sequence                    No.

1.5 kb      RTF1: 36 - 55              5' GCAAGGAGGGTTTGTCACTG 3'         5
            RTR1: 1558 - 1536          5' CCGATTCCACTGTAGTGTTAGCC 3'      6

906 bp      H13: 1539 - 1558           5' TAACACTACAGTGGAATCGG 3'         7
            RTR4: 2445 - 2426          5' AAATCCAGGCAGAGCACGAG 3'         8

919 bp      RTR5: 2424 - 2447          5' TGCTCGTGCTCTGCCTGGATTTCC 3'     9
            RTR5: 3343 - 3320          5' ATTGAATGGTCATTGACATGAGAC 3'     10
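
The internal consistency of the primer table can be checked with a short script. The Python sketch below is illustrative only. It assumes the coordinates and sequences read as transcribed here, that consecutive rows form forward/reverse pairs, and that the helper names (`reverse_complement`, `span_length`) are merely convenient labels. It confirms that each primer length equals its coordinate span, computes inclusive amplicon lengths, and checks that adjacent fragments overlap through complementary primer sequence.

```python
# (name, 5' coordinate, 3' coordinate, sequence 5'->3') as transcribed from Table 1.
PRIMERS = [
    ("RTF1", 36, 55, "GCAAGGAGGGTTTGTCACTG"),
    ("RTR1", 1558, 1536, "CCGATTCCACTGTAGTGTTAGCC"),
    ("H13", 1539, 1558, "TAACACTACAGTGGAATCGG"),
    ("RTR4", 2445, 2426, "AAATCCAGGCAGAGCACGAG"),
    ("RTR5", 2424, 2447, "TGCTCGTGCTCTGCCTGGATTTCC"),
    ("RTR5", 3343, 3320, "ATTGAATGGTCATTGACATGAGAC"),
]

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def span_length(first: int, last: int) -> int:
    """Inclusive number of nucleotides between two coordinates."""
    return abs(last - first) + 1


# Each primer should be exactly as long as the coordinate range it covers.
lengths_match = all(span_length(s, e) == len(seq) for _, s, e, seq in PRIMERS)

# Inclusive amplicon length from the 5' end of each forward primer
# to the 5' end of its paired reverse primer.
amplicon_sizes = [span_length(PRIMERS[i][1], PRIMERS[i + 1][1])
                  for i in range(0, len(PRIMERS), 2)]

# Overlap check 1: H13 (1539-1558) lies inside the top-strand footprint of RTR1 (1536-1558).
rtr1_top = reverse_complement(PRIMERS[1][3])
first_overlap_ok = rtr1_top[1539 - 1536:] == PRIMERS[2][3]

# Overlap check 2: the top-strand footprint of RTR4 (2426-2445) lies inside
# the third forward primer (2424-2447).
rtr4_top = reverse_complement(PRIMERS[3][3])
second_overlap_ok = PRIMERS[4][3][2426 - 2424:2445 - 2424 + 1] == rtr4_top

print(lengths_match, amplicon_sizes, first_overlap_ok, second_overlap_ok)
```

Counted inclusively, the second and third fragments come out one nucleotide longer than the 906 bp and 919 bp listed, which is consistent with the table sizes having been obtained by simple subtraction of the end coordinates.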

We carried out PCR in 50 µl of a reaction mixture consisting of 10
mM Tris-HCl (pH 8.3), 50 mM KCl, 2.5 mM MgCl2, 0.5 mM dNTPs, 1 µM
primers and 2.5 units Taq polymerase (Perkin-Elmer Cetus). Thirty cycles were
used, each consisting of denaturing at 96°C for 1 min, annealing at 55°C
for 1 min and extending at 72°C for 2 min. The amplified products were
extracted with 1 volume chloroform and purified by filtration through
Centricon™ 100 filter units (Amicon, Inc., Beverly, MA). Typically, we used one
fifth of the PCR product for DNA sequence analysis with an Applied
Biosystems 373A automated sequencer and fluorescent dye-deoxy terminator
chemistry. We selected appropriate primers for DNA sequencing from the
DPD cDNA sequence disclosed herein and synthesized them using an
Applied Biosystems 394 DNA & RNA synthesizer. Sequence data were
analyzed using MacVector™ sequence analysis software (International
Biotechnologies).
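
The cycling program described above can be written down as data, which makes the total hold time easy to verify. This is a minimal sketch only; the `Step` type and the function name are hypothetical, and ramp times between temperatures are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    temp_c: float
    minutes: float


# One cycle as described in the text: 96°C 1 min, 55°C 1 min, 72°C 2 min.
CYCLE = (
    Step("denature", 96.0, 1.0),
    Step("anneal", 55.0, 1.0),
    Step("extend", 72.0, 2.0),
)
N_CYCLES = 30
REACTION_VOLUME_UL = 50.0


def hold_time_minutes(cycle, n_cycles: int) -> float:
    """Total programmed hold time, excluding temperature ramps."""
    return n_cycles * sum(step.minutes for step in cycle)


# One fifth of the PCR product is used for sequencing.
sequencing_aliquot_ul = REACTION_VOLUME_UL / 5

if __name__ == "__main__":
    print(hold_time_minutes(CYCLE, N_CYCLES), "min")
    print(sequencing_aliquot_ul, "µl")
```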

PCR Product Analysis and Southern Blots. We analyzed the PCR
fragments by electrophoresis through a 1% agarose gel in the presence of
ethidium bromide. Prior to Southern blotting, the gels were depurinated by a 20
min incubation in 200 mM HCl, after which we denatured the DNA by a 20 min
incubation in 0.5 M NaOH. The DNA was transferred to Gene Screen Plus™
membranes (New England Biolabs) overnight in 0.5 M NaOH as the transfer
solution. We fixed the DNA by baking at 80°C, prehybridized at 65°C for 3 h
in a solution containing 6X SSC, 1X Denhardt's reagent, 0.5% sodium dodecyl
sulfate and 0.2 mg/ml sonicated salmon sperm DNA. We then hybridized
overnight at 65°C in the same solution containing 1.5 x 10^6 cpm/ml of
random-primed, labelled human DPD cDNA. After washing at 65°C for 20 min
in 2 x SSC, 0.5% SDS and for 45 min in 0.1 x SSC, 0.5% SDS, the
membranes were exposed to X-ray film (Eastman Kodak, Co.) at -80°C for 30
min.
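
Preparing the prehybridization solution comes down to the dilution relation C1 x V1 = C2 x V2 for each component. The sketch below illustrates the arithmetic only. The stock concentrations (20X SSC, 50X Denhardt's reagent, 10% SDS, 10 mg/ml salmon sperm DNA) and the 100 ml batch size are assumptions chosen for the example and are not taken from the text.

```python
def stock_volume(final_conc: float, stock_conc: float, final_volume_ml: float) -> float:
    """Volume of stock (ml) needed so that stock_conc * v = final_conc * final_volume."""
    if stock_conc <= 0 or final_conc > stock_conc:
        raise ValueError("stock must be positive and at least as concentrated as the target")
    return final_volume_ml * final_conc / stock_conc


FINAL_VOLUME_ML = 100.0  # hypothetical batch size

# component: (final concentration from the text, assumed stock concentration)
RECIPE = {
    "SSC (X)": (6.0, 20.0),
    "Denhardt's reagent (X)": (1.0, 50.0),
    "SDS (%)": (0.5, 10.0),
    "salmon sperm DNA (mg/ml)": (0.2, 10.0),
}

volumes = {
    name: stock_volume(final, stock, FINAL_VOLUME_ML)
    for name, (final, stock) in RECIPE.items()
}
water_ml = FINAL_VOLUME_ML - sum(volumes.values())

if __name__ == "__main__":
    for name, ml in volumes.items():
        print(f"{name}: {ml:.1f} ml of stock")
    print(f"water: {water_ml:.1f} ml")
```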



Western Immunoblots. We carried out SDS-polyacrylamide gel
electrophoresis using the method of Laemmli (1970) Nature 227: 107-111.
The separated proteins were transferred to nitrocellulose by semi-dry
electroblotting for 90 min at 1.5 mA/cm². We detected DPD polypeptides using
rabbit anti-pig DPD primary antibody and the enhanced chemiluminescence (ECL)
detection method (Amersham Corp.), following the directions supplied by the
manufacturer. Protein concentrations were determined using the bicinchoninic
acid procedure (Pierce Chemical Co., Rockford, IL), using bovine serum albumin
as standard.

Catalytic Activity. We measured DPD activity in human fibroblast
extracts by HPLC using a modification of the method described by Tuchman et
al. (1989) Enzyme 42: 15-24, using [14C]-thymine as substrate.
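
DPD activity measured in this way is reported in the Results as nmol of product per hour per mg of protein. The following minimal sketch shows that normalization. The function name and the input values are hypothetical and are not measurements from this study.

```python
def specific_activity(product_nmol: float, incubation_h: float, protein_mg: float) -> float:
    """Specific activity in nmol per hour per mg protein."""
    if incubation_h <= 0 or protein_mg <= 0:
        raise ValueError("incubation time and protein amount must be positive")
    return product_nmol / (incubation_h * protein_mg)


if __name__ == "__main__":
    # Hypothetical assay: 0.5 nmol product formed in 2 h by 0.25 mg extract protein.
    print(specific_activity(0.5, 2.0, 0.25), "nmol/h/mg protein")
```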

RESULTS 

Clinical evaluation. We have studied the genetic basis for the
complete lack of DPD activity in one of the members of the pedigree shown in
Figure 4. The patient (subject 1) was admitted to the hospital at the age of 25
months with bilateral microphthalmia, iris and choroidea coloboma, and
nystagmus, in addition to a gradually increasing psychomotor retardation.
However, no growth retardation or neurological abnormalities were detected.
All other members of the pedigree were healthy and showed no abnormalities.
The patient was diagnosed with severe thymine-uraciluria. Skin biopsies
were taken to establish the fibroblast cultures that were used in this study.

RT-PCR analysis of the DPD mRNA in cultured fibroblasts.
Fibroblast total RNA from every subject was subjected to RT-PCR. The PCR
products were hybridized with the [32P]-labelled human DPD cDNA and the
result is shown in Figure 5. The coding sequence of the DPD cDNA was fully
amplified in three fragments that span 1500, 906 and 919 bp. All the
fragments are present in every subject, including the patient. The 1500 and 919
bp fragments were constant in all subjects. However, the 906 bp fragment
was found in only certain subjects and was in linkage disequilibrium with a
fragment of 741 bp. The latter was homozygous in the deficient patient and
found together with the predicted normal size fragment in both parents. One
sibling was heterozygous and another was homozygous for the normal allele.
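
The genotype assignment described above follows directly from which of the two middle-region products (906 bp and 741 bp) a subject shows. A minimal sketch of that rule follows. The function name and the returned labels are our own, and exact size matching is a simplification of reading bands from a gel.

```python
NORMAL_BP = 906   # middle fragment from the normal transcript
VARIANT_BP = 741  # middle fragment from the variant transcript


def classify(fragment_sizes) -> str:
    """Assign a genotype label from the middle-region RT-PCR products observed."""
    sizes = set(fragment_sizes)
    has_normal = NORMAL_BP in sizes
    has_variant = VARIANT_BP in sizes
    if has_normal and has_variant:
        return "heterozygous"
    if has_variant:
        return "homozygous variant"
    if has_normal:
        return "homozygous normal"
    return "undetermined"


if __name__ == "__main__":
    print(classify([1500, 741, 919]))       # pattern described for the patient
    print(classify([1500, 906, 741, 919]))  # pattern described for both parents
    print(classify([1500, 906, 919]))       # pattern of the unaffected sibling
```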

To confirm the possibility of a deletion in the mRNA-derived cDNA associated
with the DPYD alleles of these subjects, we sequenced the PCR fragments
using nested primers and found that the 741 bp fragment originated from the
906 bp fragment by a deletion of 165 bp. A schematic representation of the
structure of both mRNAs is shown in Figure 6. Through partial sequencing of
the DPYD gene, we found that the deletion present in the mRNA coincided
with splice sites in the genomic sequence of the DPYD gene that delimit a
165 bp exon. We have also found that the DNA corresponding to the deletion is
present in the genomic DNA from the fibroblast cell lines since, as shown in
Figure 7, the deleted cDNA sequence can be amplified by PCR from the genomic
DNA of the patient, as well as from genomic DNA from other members of the
family. These results indicate that the variant transcript is not the result of
a large deletion containing the missing exon, but rather is the result of a
mutation that causes incorrect splicing.
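
The fragment sizes are consistent with the loss of a single 165 bp exon, and the arithmetic can be made explicit. The sketch below is illustrative only. The conclusion that the reading frame is preserved downstream assumes that the skipped exon lies entirely within the coding sequence.

```python
NORMAL_FRAGMENT_BP = 906
DELETED_EXON_BP = 165


def variant_fragment_bp(normal_bp: int, deleted_bp: int) -> int:
    """Expected size of the RT-PCR product lacking the exon."""
    return normal_bp - deleted_bp


def codons_lost(deleted_bp: int):
    """Number of whole codons removed, or None if the deletion shifts the frame."""
    return deleted_bp // 3 if deleted_bp % 3 == 0 else None


if __name__ == "__main__":
    print(variant_fragment_bp(NORMAL_FRAGMENT_BP, DELETED_EXON_BP), "bp")
    print(codons_lost(DELETED_EXON_BP), "codons removed in frame")
```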

Catalytic activity and DPD protein content. DPD activities from
the fibroblast cell lines were determined by HPLC (Table I). The maximum
activity, 1 nmol h⁻¹ mg protein⁻¹, corresponds to subject 3, who was
homozygous for the normal mRNA. The parents and another sibling (subjects
4, 5, and 2) showed lower values and the patient, subject 1, had background
activity. It should be noted that the DPD activity obtained in human fibroblasts
is about 8-9 times lower than the equivalent activity in human
lymphocytes.

To determine if the DPD protein content in our subjects follows a
pattern similar to that of the catalytic activity, we measured fibroblast DPD
protein by Western blots. DPD protein was not detectable in the patient, but
was found in two other members of his family (subjects 2 and 4 in Figure 4)
who were analyzed for comparison.

The catalytic activity pattern correlates with the DPD protein
content for the different subjects. As expected, the patient with only
background DPD activity in his fibroblasts has no detectable DPD band in the
Western blot when using an anti-pig DPD polyclonal antibody, suggesting a
complete lack of DPD protein. It is interesting to note that even though the
DPD protein is defective and does not accumulate in the cell, the DPD mRNA is
present, indicating that the defective mRNA is not particularly unstable as
compared to the mRNA encoding the active DPD protein.

In conclusion, this study established with certainty that
thymine-uraciluria is due to a mutation in the DPYD gene.

Unless defined otherwise, all technical and scientific terms used 
herein have the same meaning as commonly understood by one of ordinary skill 
in the art to which this invention belongs. Although any methods and materials 
similar or equivalent to those described can be used in the practice or testing of 
the present invention, the preferred methods and materials are now described. 
All publications and patent documents referenced in this application are 
incorporated by reference. 

It is understood that the examples and embodiments described 
herein are for illustrative purposes only and that various modifications or 
changes in light thereof will be suggested to persons skilled in the art and are to 
be included within the spirit and purview of this application and scope of the 
appended claims. 
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SEQUENCE LISTING 



(1) GENERAL INFORMATION:

     (i) APPLICANT: GONZALEZ, Frank J.
                    FERNANDEZ-SALGUERO, Pedro

    (ii) TITLE OF INVENTION: CLONING AND EXPRESSION OF cDNA FOR HUMAN
                             DIHYDROPYRIMIDINE DEHYDROGENASE

   (iii) NUMBER OF SEQUENCES: 13

    (iv) CORRESPONDENCE ADDRESS:
          (A) ADDRESSEE: Townsend and Townsend Khourie and Crew
          (B) STREET: Steuart Street Tower, One Market Plaza
          (C) CITY: San Francisco
          (D) STATE: California
          (E) COUNTRY: US
          (F) ZIP: 94105-1493

     (v) COMPUTER READABLE FORM:
          (A) MEDIUM TYPE: Floppy disk
          (B) COMPUTER: IBM PC compatible
          (C) OPERATING SYSTEM: PC-DOS/MS-DOS
          (D) SOFTWARE: PatentIn Release #1.0, Version #1.25

    (vi) CURRENT APPLICATION DATA:
          (A) APPLICATION NUMBER: US not yet designated
          (B) FILING DATE: 09-SEP-1994
          (C) CLASSIFICATION:

  (viii) ATTORNEY/AGENT INFORMATION:
          (A) NAME: Smith, Timothy L.
          (B) REGISTRATION NUMBER: 35,367
          (C) REFERENCE/DOCKET NUMBER: 15280-210

    (ix) TELECOMMUNICATION INFORMATION:
          (A) TELEPHONE: (415) 543-9600
          (B) TELEFAX: (415) 543-5043



(2) INFORMATION FOR SEQ ID NO: 1:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 3957 base pairs
          (B) TYPE: nucleic acid
          (C) STRANDEDNESS: single
          (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: cDNA

    (ix) FEATURE:
          (A) NAME/KEY: CDS
          (B) LOCATION: 88..3162

    (ix) FEATURE:
          (A) NAME/KEY: misc_feature
          (B) LOCATION: 1..3957
          (D) OTHER INFORMATION: /product= "Human DPD"

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

AGACACGCTG TCACTTGGCT CTCTGGCTGG AGCTTGAGGA CGCAAGGAGG GTTTGTCACT  60

GGCAGACTCG AGACTGTAGG CACTGCC ATG GCC CCT GTG CTC AGT AAG GAC  111
                              Met Ala Pro Val Leu Ser Lys Asp
                              1               5

TCG GCG GAC ATC GAG AGT ATC CTG GCT TTA AAT CCT CGA ACA CAA ACT  159
Ser Ala Asp Ile Glu Ser Ile Leu Ala Leu Asn Pro Arg Thr Gln Thr
    10                  15                  20

CAT GCA ACT CTG TGT TCC ACT TCG GCC AAG AAA TTA GAC AAG AAA CAT  207
His Ala Thr Leu Cys Ser Thr Ser Ala Lys Lys Leu Asp Lys Lys His
25                  30                  35                  40

??? ??? ??? ??? ??? GAT AAG AAC TGC ??? ??? TGT GAG AAG CTG GAG  255
Trp Lys Arg Asn Pro Asp Lys Asn Cys Phe Asn Cys Glu Lys Leu Glu
                45                  50                  55

AAT AAT TTT GAT GAC ATC AAG CAC ACG ACT CTT GGT GAG CGA GGA GCT  303
Asn Asn Phe Asp Asp Ile Lys His Thr Thr Leu Gly Glu Arg Gly Ala
            60                  65                  70

CTC CGA GAA GCA ATG AGA TGC CTG AAA TGT GCA GAT GCC CCG TGT CAG  351
Leu Arg Glu Ala Met Arg Cys Leu Lys Cys Ala Asp Ala Pro Cys Gln
        75                  80                  85

AAG AGC TGT CCA ACT AAT CTT GAT ATT AAA TCA TTC ATC ACA AGT ATT  399
Lys Ser Cys Pro Thr Asn Leu Asp Ile Lys Ser Phe Ile Thr Ser Ile
    90                  95                  100

GCA AAC AAG AAC TAT TAT GGA GCT GCT AAG ATG ATA TTT TCT GAC AAC  447
Ala Asn Lys Asn Tyr Tyr Gly Ala Ala Lys Met Ile Phe Ser Asp Asn
105                 110                 115                 120

CCA CTT GGT CTG ACT TGT GGA ATG GTA TGT CCA ACC TCT GAT CTA TGT  495
Pro Leu Gly Leu Thr Cys Gly Met Val Cys Pro Thr Ser Asp Leu Cys
                125                 130                 135

??? ??? ??? ??? ??? ??? TAT GCC ACT GAA GAG GGA CCC ATT AAT ATT  543
Val Gly Gly Cys Asn Leu Tyr Ala Thr Glu Glu Gly Pro Ile Asn Ile
            140                 145                 150

GGT GGA TTG CAG CAA TTT GCT ACT GAG GTA TTC AAA GCA ATG AGT ATC  591
Gly Gly Leu Gln Gln Phe Ala Thr Glu Val Phe Lys Ala Met Ser Ile
        155                 160                 165

??? ??? ??? AGA AAT CCT TCG CTG CCT CCC CCA GAA AAA ATG TCT GAA  639
Pro Gln Ile Arg Asn Pro Ser Leu Pro Pro Pro Glu Lys Met Ser Glu
    170                 175                 180

GCC TAT TCT GCA AAG ATT GCT CTT TTT GGT GCT GGG CCT GCA AGT ATA  687
Ala Tyr Ser Ala Lys Ile Ala Leu Phe Gly Ala Gly Pro Ala Ser Ile
185                 190                 195                 200

AGT TGT GCT TCC TTT TTG GCT CGA TTG GGG TAC TCT GAC ATC ACT ATA  735
Ser Cys Ala Ser Phe Leu Ala Arg Leu Gly Tyr Ser Asp Ile Thr Ile
                205                 210                 215

TTT GAA AAA CAA GAA TAT GTT GGT GGT TTA AGT ACT TCT GAA ATT CCT  783
Phe Glu Lys Gln Glu Tyr Val Gly Gly Leu Ser Thr Ser Glu Ile Pro
            220                 225                 230

CAG TTC CGG CTG CCG TAT GAT GTA GTG AAT TTT GAG ATT GAG CTA ATG  831
Gln Phe Arg Leu Pro Tyr Asp Val Val Asn Phe Glu Ile Glu Leu Met
        235                 240                 245

AAG GAC CTT GGT GTA AAG ATA ATT TGC GGT AAA AGC CTT TCA GTG AAT  879
Lys Asp Leu Gly Val Lys Ile Ile Cys Gly Lys Ser Leu Ser Val Asn
    250                 255                 260

GAA ATG ACT CTT AGC ACT TTG AAA GAA AAA GGC TAC AAA GCT GCT TTC  927
Glu Met Thr Leu Ser Thr Leu Lys Glu Lys Gly Tyr Lys Ala Ala Phe
265                 270                 275                 280

ATT GGA ATA GGT TTG CCA GAA CCC AAT AAA GAT GCC ATC TTC CAA GGC  975
Ile Gly Ile Gly Leu Pro Glu Pro Asn Lys Asp Ala Ile Phe Gln Gly
                285                 290                 295

CTG ACG CAG GAC CAG GGG TTT TAT ACA TCC AAA GAC TTT TTG CCA CTT  1023
Leu Thr Gln Asp Gln Gly Phe Tyr Thr Ser Lys Asp Phe Leu Pro Leu
            300                 305                 310

GTA GCC AAA GGC AGT AAA GCA GGA ATG TGC GCC TGT CAC TCT CCA TTG  1071
Val Ala Lys Gly Ser Lys Ala Gly Met Cys Ala Cys His Ser Pro Leu
        315                 320                 325

CCA TCG ATA CGG GGA GTC GTG ATT GTA CTT GGA GCT GGA GAC ACT GCC  1119
Pro Ser Ile Arg Gly Val Val Ile Val Leu Gly Ala Gly Asp Thr Ala
    330                 335                 340

TTC GAC TGT GCA ACA TCT GCT CTA CGT TGT GGA GCT CGC CGA GTG TTC  1167
Phe Asp Cys Ala Thr Ser Ala Leu Arg Cys Gly Ala Arg Arg Val Phe
345                 350                 355                 360

ATC GTC TTC AGA AAA GGC TTT GTT AAT ATA AGA GCT GTC CCT GAG GAG  1215
Ile Val Phe Arg Lys Gly Phe Val Asn Ile Arg Ala Val Pro Glu Glu
                365                 370                 375

ATG GAG CTT GCT AAG GAA GAA AAG TGT GAA TTT CTG CCA TTC CTG TCC  1263
Met Glu Leu Ala Lys Glu Glu Lys Cys Glu Phe Leu Pro Phe Leu Ser
            380                 385                 390

CCA CGG AAG GTT ATA GTA AAA GGT GGG AGA ATT GTT GCT ATG CAG TTT  1311
Pro Arg Lys Val Ile Val Lys Gly Gly Arg Ile Val Ala Met Gln Phe
        395                 400                 405

GTT CGG ACA GAG CAA GAT GAA ACT GGA AAA TGG AAT GAA GAT GAA GAT  1359
Val Arg Thr Glu Gln Asp Glu Thr Gly Lys Trp Asn Glu Asp Glu Asp
    410                 415                 420

CAG ATG GTC CAT CTG AAA GCC GAT GTG GTC ATC AGT GCC TTT GGT TCA  1407
Gln Met Val His Leu Lys Ala Asp Val Val Ile Ser Ala Phe Gly Ser
425                 430                 435                 440

GTT CTG AGT GAT CCT AAA GTA AAA GAA GCC TTG AGC CCT ATA AAA TTT  1455
Val Leu Ser Asp Pro Lys Val Lys Glu Ala Leu Ser Pro Ile Lys Phe
                445                 450                 455

AAC AGA TGG GGT CTC CCA GAA GTA GAT CCA GAA ACT ATG CAA ACT AGT  1503
Asn Arg Trp Gly Leu Pro Glu Val Asp Pro Glu Thr Met Gln Thr Ser
            460                 465                 470

GAA GCA TGG GTA TTT GCA GGT GGT GAT GTC GTT GGT TTG GCT AAC ACT  1551
Glu Ala Trp Val Phe Ala Gly Gly Asp Val Val Gly Leu Ala Asn Thr
        475                 480                 485

ACA GTG GAA TCG GTG AAT GAT GGA AAG CAA GCT TCT TGG TAC ATT CAC  1599
Thr Val Glu Ser Val Asn Asp Gly Lys Gln Ala Ser Trp Tyr Ile His
    490                 495                 500

AAA TAC GTA CAG TCA CAA TAT GGA GCT TCC GTT TCT GCC AAG CCT GAA  1647
Lys Tyr Val Gln Ser Gln Tyr Gly Ala Ser Val Ser Ala Lys Pro Glu
505                 510                 515                 520

CTA CCC CTC TTT TAC ACT CCT ATT GAT CTG GTG GAC ATT AGT GTA GAA  1695
Leu Pro Leu Phe Tyr Thr Pro Ile Asp Leu Val Asp Ile Ser Val Glu
                525                 530                 535

ATG GCC GGA TTG AAG TTT ATA AAT CCT TTT GGT CTT GCT AGC GCA ACT  1743
Met Ala Gly Leu Lys Phe Ile Asn Pro Phe Gly Leu Ala Ser Ala Thr
            540                 545                 550

CCA GCC ACC AGC ACA TCA ATG ATT CGA AGA GCT TTT GAA GCT GGA TGG  1791
Pro Ala Thr Ser Thr Ser Met Ile Arg Arg Ala Phe Glu Ala Gly Trp
        555                 560                 565

??? ??? ??? ??? ??? ??? ACT ??? ??? ??? GAT AAG GAC ATT GTG ACA  1839
Gly Phe Ala Leu Thr Lys Thr Phe Ser Leu Asp Lys Asp Ile Val Thr
    570                 575                 580

AAT GTT TCC CCC AGA ATC ATC CGG GGA ACC ACC TCT GGC CCC ATG TAT  1887
Asn Val Ser Pro Arg Ile Ile Arg Gly Thr Thr Ser Gly Pro Met Tyr
585                 590                 595                 600

GGC CCT GGA CAA AGC TCC TTT CTG AAT ATT GAG CTC ATC AGT GAG AAA  1935
Gly Pro Gly Gln Ser Ser Phe Leu Asn Ile Glu Leu Ile Ser Glu Lys
                605                 610                 615

??? ??? ??? ??? ??? TGT CAA AGT GTC ACT GAA CTA AAG GCT GAC TTC  1983
Thr Ala Ala Tyr Trp Cys Gln Ser Val Thr Glu Leu Lys Ala Asp Phe
            620                 625                 630

CCA GAC AAC ATT GTG ATT GCT AGC ATT ATG TGC AGT TAC AAT AAA AAT  2031
Pro Asp Asn Ile Val Ile Ala Ser Ile Met Cys Ser Tyr Asn Lys Asn
        635                 640                 645

GAC TGG ACG GAA CTT GCC AAG AAG TCT GAG GAT TCT GGA GCA GAT GCC  2079
Asp Trp Thr Glu Leu Ala Lys Lys Ser Glu Asp Ser Gly Ala Asp Ala
    650                 655                 660

??? ??? ??? ??? ??? TCA TGT CCA ??? GGC ATG GGA GAA AGA GGA ATG  2127
Leu Glu Leu Asn Leu Ser Cys Pro His Gly Met Gly Glu Arg Gly Met
665                 670                 675                 680

??? ??? ??? ??? GGG ??? GAT CCA GAG CTG GTG CGG AAC ATC TGC CGC  2175
Gly Leu Ala Cys Gly Gln Asp Pro Glu Leu Val Arg Asn Ile Cys Arg
                685                 690                 695

TGG GTT AGG CAA GCT GTT CAG ATT CCT TTT TTT GCC AAG CTG ACC CCA  2223
Trp Val Arg Gln Ala Val Gln Ile Pro Phe Phe Ala Lys Leu Thr Pro
            700                 705                 710

AAT GTC ACT GAT ATT GTG AGC ATC GCA AGA GCT GCA AAG GAA GGT GCT  2271
Asn Val Thr Asp Ile Val Ser Ile Ala Arg Ala Ala Lys Glu Gly Gly
        715                 720                 725

GCC AAT GGC GTT ACA GCC ACC AAC ACT GTC TCA GGT CTG ATG GGA TTA  2319
Ala Asn Gly Val Thr Ala Thr Asn Thr Val Ser Gly Leu Met Gly Leu
    730                 735                 740

AAA TCT GAT GGC ACA CCT TGG CCA GCA GTG GGG ATT GCA AAG CGA ACT  2367
Lys Ser Asp Gly Thr Pro Trp Pro Ala Val Gly Ile Ala Lys Arg Thr
745                 750                 755                 760

ACA TAT GGA GGA GTG TCT GGG ACA GCA ATC AGA CCT ATT GCT TTG AGA  2415
Thr Tyr Gly Gly Val Ser Gly Thr Ala Ile Arg Pro Ile Ala Leu Arg
                765                 770                 775

GCT GTG ACC TCC ATT GCT CGT GCT CTG CCT GGA TTT CCC ATT TTG GCT  2463
Ala Val Thr Ser Ile Ala Arg Ala Leu Pro Gly Phe Pro Ile Leu Ala
            780                 785                 790

ACT GGT GGA ATT GAC TCT GCT GAA AGT GGT CTT CAG TTT CTC CAT AGT  2511
Thr Gly Gly Ile Asp Ser Ala Glu Ser Gly Leu Gln Phe Leu His Ser
        795                 800                 805

GGT GCT TCC GTC CTC CAG GTA TGC AGT GCC ATT CAG AAT CAG GAT TTC  2559
Gly Ala Ser Val Leu Gln Val Cys Ser Ala Ile Gln Asn Gln Asp Phe
    810                 815                 820

ACT GTG ATC GAA GAC TAC TGC ACT GGC CTC AAA GCC CTG CTT TAT CTG  2607
Thr Val Ile Glu Asp Tyr Cys Thr Gly Leu Lys Ala Leu Leu Tyr Leu
825                 830                 835                 840

AAA AGC ATT GAA GAA CTA CAA GAC TGG GAT GGA CAG AGT CCA GCT ACT  2655
Lys Ser Ile Glu Glu Leu Gln Asp Trp Asp Gly Gln Ser Pro Ala Thr
                845                 850                 855

GTG AGT CAC CAG AAA GGG AAA CCA GTT CCA CGT ATA GCT GAA CTC ATG  2703
Val Ser His Gln Lys Gly Lys Pro Val Pro Arg Ile Ala Glu Leu Met
            860                 865                 870

GAC AAG AAA CTG CCA AGT TTT GGA CCT TAT CTG GAA CAG CGC AAG AAA  2751
Asp Lys Lys Leu Pro Ser Phe Gly Pro Tyr Leu Glu Gln Arg Lys Lys
        875                 880                 885

ATC ATA GCA GAA AAC AAG ATT AGA CTG AAA GAA CAA AAT GTA GCT TTT  2799
Ile Ile Ala Glu Asn Lys Ile Arg Leu Lys Glu Gln Asn Val Ala Phe
    890                 895                 900

TCA CCA CTT AAG AGA AGC TGT TTT ATC CCC AAA AGG CCT ATT CCT ACC  2847
Ser Pro Leu Lys Arg Ser Cys Phe Ile Pro Lys Arg Pro Ile Pro Thr
905                 910                 915                 920

ATC AAG GAT GTA ATA GGA AAA GCA CTG CAG TAC CTT GGA ACA TTT GGT  2895
Ile Lys Asp Val Ile Gly Lys Ala Leu Gln Tyr Leu Gly Thr Phe Gly
                925                 930                 935

GAA TTG AGC AAC GTA GAG CAA GTT GTG GCT ATG ATT GAT GAA GAA ATG  2943
Glu Leu Ser Asn Val Glu Gln Val Val Ala Met Ile Asp Glu Glu Met
            940                 945                 950

TGT ATC AAC TGT GGT AAA TGC TAC ATG ACC TGT AAT GAT TCT GGC TAC  2991
Cys Ile Asn Cys Gly Lys Cys Tyr Met Thr Cys Asn Asp Ser Gly Tyr
        955                 960                 965

CAG GCT ATA CAG TTT GAT CCA GAA ACC CAC CTG CCC ACC ATA ACC GAC  3039
Gln Ala Ile Gln Phe Asp Pro Glu Thr His Leu Pro Thr Ile Thr Asp
    970                 975                 980

ACT TGT ACA GGC TGT ACT CTG TGT CTC AGT GTT TGC CCT ATT GTC GAC  3087
Thr Cys Thr Gly Cys Thr Leu Cys Leu Ser Val Cys Pro Ile Val Asp
985                 990                 995                 1000

TGC ATC AAA ATG GTT TCC AGG ACA ACA CCT TAT GAA CCA AAG AGA GGC  3135
Cys Ile Lys Met Val Ser Arg Thr Thr Pro Tyr Glu Pro Lys Arg Gly
                1005                1010                1015

GTA CCC TTA TCT GTG AAT CCG GTG TGT TAAGGTGATT TGTGAAACAG  3182
Val Pro Leu Ser Val Asn Pro Val Cys
            1020                1025



TTGCTGTGAA CTTTCATGTC ACCTACATAT GCTGATCTCT TAAAATCATG ATCCTTGTGT  3242

TCAGCTCTTT CCAAATTAAA ACAAATATAC ATTTTCTAAA TAAAAATATG TAATTTCAAA  3302

ATACATTTGT AAGTGTAAAA AATGTCTCAT GTCAATGACC ATTCAATTAG TGGCATAAAA  3362

TAGAATAATT CTTTTCTGAG GATAGTAGTT AAATAACTGT GTGGCAGTTA ATTGGATGTT  3422

CACTGCCAGT TGTCTTATGT GAAAAATTAA CTTTTTGTGT GGCAATTAGT GTGACAGTTT  3482

CCAAATTGCC CTATGCTGTG CTCCATATTT GATTTCTAAT TGTAAGTGAA ATTAAGCATT  3542

TTGAAACAAA GTACTCTTTA ACATACAAGA AAATGTATCC AAGGAAACAT TTTATCAATA  3602

AAAATTACCT TTAATTTTAA TGCTGTTTCT AAGAAAATGT AGTTAGCTCC ATAAAGTACA  3662

AATGAAGAAA GTCAAAAATT ATTTGCTATG GCAGGATAAG AAAGCCTAAA ATTGAGTTTG  3722

TGGACTTTAT TAAGTAAAAT CCCCTTCGCT GAAATTGCTT ATTTTTGGTG TTGGATAGAG  3782

GATAGGGAGA ATATTTACTA ACTAAATACC ATTCACTACT ??TGCGTGAG ATGGGTGTAC  3842

AAACTCATCC TCTTTTAATG GCATTTCTCT TTAAACTATG TTCCTAACCA AATGAGATGA  3902

TAGGATAGAT CCTGGTTACC ACTCTTTTAC TGTGCACATA TGGGCCCCGG AATTC  3957
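
The CDS feature of SEQ ID NO: 1 (88..3162) and the 1025 amino acid length given for SEQ ID NO: 2 can be cross-checked with a few lines of code. The sketch below is illustrative and assumes the standard genetic code. The function names are hypothetical, and only the first and last codons of the open reading frame, as printed in the listing, are translated.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

CDS_START, CDS_END = 88, 3162  # CDS feature of SEQ ID NO: 1


def cds_codons(start: int, end: int) -> int:
    """Number of codons in an inclusive coordinate range."""
    length = end - start + 1
    if length % 3:
        raise ValueError("CDS length is not a multiple of three")
    return length // 3


def translate(dna: str) -> str:
    """Translate whole codons of a DNA string into one-letter amino acids."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


FIRST_CODONS = "ATGGCCCCTGTGCTCAGTAAGGAC"       # nucleotides 88-111
LAST_CODONS = "GTACCCTTATCTGTGAATCCGGTGTGTTAA"  # nucleotides 3136-3165

if __name__ == "__main__":
    print(cds_codons(CDS_START, CDS_END))
    print(translate(FIRST_CODONS))
    print(translate(LAST_CODONS))
```

The codon count matches the protein length because the annotated range stops at the last sense codon; the TAA stop codon follows immediately after it.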

(2) INFORMATION FOR SEQ ID NO: 2:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 1025 amino acids
          (B) TYPE: amino acid
          (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: protein

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

Met Ala Pro Val Leu Ser Lys Asp Ser Ala Asp Ile Glu Ser Ile Leu
1               5                   10                  15

Ala Leu Asn Pro Arg Thr Gln Thr His Ala Thr Leu Cys Ser Thr Ser
            20                  25                  30

Ala Lys Lys Leu Asp Lys Lys His Trp Lys Arg Asn Pro Asp Lys Asn
        35                  40                  45

Cys Phe Asn Cys Glu Lys Leu Glu Asn Asn Phe Asp Asp Ile Lys His
    50                  55                  60

Thr Thr Leu Gly Glu Arg Gly Ala Leu Arg Glu Ala Met Arg Cys Leu
65                  70                  75                  80

Lys Cys Ala Asp Ala Pro Cys Gln Lys Ser Cys Pro Thr Asn Leu Asp
                85                  90                  95

Ile Lys Ser Phe Ile Thr Ser Ile Ala Asn Lys Asn Tyr Tyr Gly Ala
            100                 105                 110

Ala Lys Met Ile Phe Ser Asp Asn Pro Leu Gly Leu Thr Cys Gly Met
        115                 120                 125

Val Cys Pro Thr Ser Asp Leu Cys Val Gly Gly Cys Asn Leu Tyr Ala
    130                 135                 140

Thr Glu Glu Gly Pro Ile Asn Ile Gly Gly Leu Gln Gln Phe Ala Thr
145                 150                 155                 160

Glu Val Phe Lys Ala Met Ser Ile Pro Gln Ile Arg Asn Pro Ser Leu
                165                 170                 175

Pro Pro Pro Glu Lys Met Ser Glu Ala Tyr Ser Ala Lys Ile Ala Leu
            180                 185                 190

Phe Gly Ala Gly Pro Ala Ser Ile Ser Cys Ala Ser Phe Leu Ala Arg
        195                 200                 205

Leu Gly Tyr Ser Asp Ile Thr Ile Phe Glu Lys Gln Glu Tyr Val Gly
    210                 215                 220

Gly Leu Ser Thr Ser Glu Ile Pro Gln Phe Arg Leu Pro Tyr Asp Val
225                 230                 235                 240


Asn Ile Glu Leu Ile Ser Glu Lys Thr Ala Ala Tyr Trp Cys Gln Ser
    610                 615                 620

Val Thr Glu Leu Lys Ala Asp Phe Pro Asp Asn Ile Val Ile Ala Ser
625                 630                 635                 640

Ile Met Cys Ser Tyr Asn Lys Asn Asp Trp Thr Glu Leu Ala Lys Lys
                645                 650                 655

Ser Glu Asp Ser Gly Ala Asp Ala Leu Glu Leu Asn Leu Ser Cys Pro
            660                 665                 670

His Gly Met Gly Glu Arg Gly Met Gly Leu Ala Cys Gly Gln Asp Pro
        675                 680                 685

Glu Leu Val Arg Asn Ile Cys Arg Trp Val Arg Gln Ala Val Gln Ile
    690                 695                 700

Pro Phe Phe Ala Lys Leu Thr Pro Asn Val Thr Asp Ile Val Ser Ile
705                 710                 715                 720

Ala Arg Ala Ala Lys Glu Gly Gly Ala Asn Gly Val Thr Ala Thr Asn
                725                 730                 735

Thr Val Ser Gly Leu Met Gly Leu Lys Ser Asp Gly Thr Pro Trp Pro
            740                 745                 750

Ala Val Gly Ile Ala Lys Arg Thr Thr Tyr Gly Gly Val Ser Gly Thr
        755                 760                 765

Ala Ile Arg Pro Ile Ala Leu Arg Ala Val Thr Ser Ile Ala Arg Ala
    770                 775                 780

Leu Pro Gly Phe Pro Ile Leu Ala Thr Gly Gly Ile Asp Ser Ala Glu
785                 790                 795                 800

Ser Gly Leu Gln Phe Leu His Ser Gly Ala Ser Val Leu Gln Val Cys
                805                 810                 815

Ser Ala Ile Gln Asn Gln Asp Phe Thr Val Ile Glu Asp Tyr Cys Thr
            820                 825                 830

Gly Leu Lys Ala Leu Leu Tyr Leu Lys Ser Ile Glu Glu Leu Gln Asp
        835                 840                 845

Trp Asp Gly Gln Ser Pro Ala Thr Val Ser His Gln Lys Gly Lys Pro
    850                 855                 860

Val Pro Arg Ile Ala Glu Leu Met Asp Lys Lys Leu Pro Ser Phe Gly
865                 870                 875                 880

Pro Tyr Leu Glu Gln Arg Lys Lys Ile Ile Ala Glu Asn Lys Ile Arg
                885                 890                 895

Leu Lys Glu Gln Asn Val Ala Phe Ser Pro Leu Lys Arg Ser Cys Phe
            900                 905                 910

Ile Pro Lys Arg Pro Ile Pro Thr Ile Lys Asp Val Ile Gly Lys Ala
        915                 920                 925

Leu Gln Tyr Leu Gly Thr Phe Gly Glu Leu Ser Asn Val Glu Gln Val
    930                 935                 940

Val Ala Met Ile Asp Glu Glu Met Cys Ile Asn Cys Gly Lys Cys Tyr
945                 950                 955                 960

Met Thr Cys Asn Asp Ser Gly Tyr Gln Ala Ile Gln Phe Asp Pro Glu
                965                 970                 975

Thr His Leu Pro Thr Ile Thr Asp Thr Cys Thr Gly Cys Thr Leu Cys
            980                 985                 990

Leu Ser Val Cys Pro Ile Val Asp Cys Ile Lys Met Val Ser Arg Thr
        995                 1000                1005

Thr Pro Tyr Glu Pro Lys Arg Gly Val Pro Leu Ser Val Asn Pro Val
    1010                1015                1020

Cys
1025
(2) INFORMATION FOR SEQ ID NO: 3:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 4447 base pairs
          (B) TYPE: nucleic acid
          (C) STRANDEDNESS: single
          (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: cDNA

    (ix) FEATURE:
          (A) NAME/KEY: CDS
          (B) LOCATION: 88..3162

    (ix) FEATURE:
          (A) NAME/KEY: misc_feature
          (B) LOCATION: 1..4447
          (D) OTHER INFORMATION: /product= "Pig DPD"

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

GGACACTCGA CZCAZZCZTZ ~~ „„„ 

-_^oo n^^uAGuA v. owoGGGAGG 3CZC3CCGGT 

GGGAGACTC C AAGCT3TC0C CATCGCC ATG GCC OCT GTG CTG AGO AAG GAC 

Met Ala Pro Val Leu Ser Lys Asp 



I£J iil ^ ^ C I AT TAT GGA GCT GG T AAG ATG ATT TTT TCT GAC AAC 
Ser Asn Lys Asn Tyr Tyr Gly Ala Ala Lys Met He JhJ leT Sp j£J 

CCT CTT GGT CTG ACC TGT GGA ATG GTA TGT CCA ACC TCT GAT CTT TCT 
Pro Leu Gly Leu Thr Cys Gly Met Val Cys Prti JS 2J ™J 

12S 130 135 /* 



60 
111 

159 



?I? 22 2S ?S £S £ JS SI 25 " T S" COA "» «=»« «» 

10 ! Leu Ala Leu Asn Pro Arg Thr Gin Ser 

15 20 

SI SS 21 SI S iff US K - - - H «j 

£ % SS i£ S 55 SS i£ SS HI SI s; - ™ SS 

js is s: sj as s # as s £ s sj gs s i s 

2S 25 f S 51? JS 55 S ^ SI KS ffi S i 51 g" 
E S 51 S S SI 25 k tie ^ IS S2 ?S iS 2" 2S - 

95 100 



207 

255 
303 

351 



447 



495 



54 3 



Sit rw IP C AAT ™ TAT GCA ACT GAA GAG GGA TCA ATT AAT ATT 

Val Gly Gly Cys Asn Leu Tyr Ala Thr Glu Glu Gly 2J J™ j£n JJJ 

145 150 

m ss ns a; as si s s ss si? is s: sa je js as »* 
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CCA CAA ATC AGG AAT CCT T3T CTG CCA TCC CAA GAG AAA ATG CCT GAA 639 

Pro Gin lie Arg Asn Pro Cys Leu Pro Ser Gin Glu Lys Met Pro Glu 
170 175 180 

5 GCT TAT TCT GCA AAG ATT GCT CTT TTG GGT GCT GGG CCT GCA AGT ATA 687 

Ala Tyr Ser Ala Lys He Ala Leu Leu Gly Ala Gly Pro Ala Ser lie 
185 190 % 195 200 

AGC TGT GCT TCC TTC TTG GCT CGA TTA GGC TAC TCT GAC ATC ACT ATA 735 
lU Ser Cys Ala Ser Phe Leu Ala Arg Leu Gly Tyr Ser Asp He Thr He 

205 210 215 

TTT GAA AAA CAA GAA TAT GTT GGT GGT TTA AGT ACT TCT GAA ATC CCT 
Phe Glu Lys Gin Glu Tyr Val Gly Gly Leu Ser Thr Ser Glu He Pro 
lb 220 225 230 



20 



25 



30 



35 



55 



783 



CAG TTC CGG CTG CCA TAT GAT GTA GTG AAT TTT GAG ATT GAG CTT ATG 831 
Gin Phe Arg Leu Pro Tyr Asp Val Val Asn Phe Glu lie Glu Leu Met 
235 240 245 

AAG GAC CTT GGT GTA AAG ATA ATT TGT GGT AAA AGC CTT TCA GAG AAT 879 

Lys Asp Leu Gly Val Lys lie lie Cys Gly Lvs Ser Leu Ser Glu Asn 
250 255 " 250 

GAA ATT ACT CTC AAC ACT TTA AAA GAA GAA GGG TAT AAA GCT GCT TTC 927 
Glu He Thr Leu Asn Tr.r Leu Lys Glu Glu Gly Tvr Lys Ala Ala Phe 
265 270 275 * 280 

ATT GGT ATA GGT TTG CCA GAA CCC AAA ACG GAT GAC ATC TTC CAA GGC 975 
lie Gly lie Gly Leu Pro Glu Pro Lys Thr Asp Asp lie Phe Gin Gly 
285 290 295 

CTG ACA CAG GAC CAG GGG TTT TAC ACA TCC AAA GAC TTT CTG CCC CTT 1023 
Leu Thr Gin Asp Gin Gly Phe Tyr Thr Ser Lys Asp Phe Leu Pro Leu 
300 305 310 



GTA GCC AAA AGC AGT AAA GCA GGA ATG TGT GCC TGT CAC TCT CCA TTG 1071 

Val Ala Lys Ser Ser Lys Ala Gly Met Cys Ala Cys His Ser Pro Leu 

An 315 320 325 

CCA TCG ATA CGG GGA GCC GTG ATT GTA CTC GGA GCT GGA GAC ACA GCT 1119 

Pro Ser He Arg Gly Ala Val lie Val Leu Gly Ala Gly Asp Thr Ala 
330 335 340 

TTC GAC TGT GCA ACA TCC GCT TTA CGT TGT GGA GCC CGC CGA GTG TTC 1167 

Phe Asp Cys Ala Thr Ser Ala Leu Arg Cys Gly Ala Arg Arg Val Phe 
345 350 . 355 360 

CTC GTC TTC AGA AAA GGC TTT GTT AAT ATA AGA GCT GTC CCT GAG GAG 1215 

„ Leu Val Phe Arg Lys Gly Phe .Val Asn He Arg Ala Val Pro Glu Glu 

50 365 370 375 



GTG GAG CTT GCT AAG GAA GAA AAA TGT GAA TTT TTG CCT TTC CTG TCC 1263 
Val Glu Leu Ala Lys Glu Glu Lys Cys Glu Phe Leu Pro Phe Leu Ser 
380 385 390 

CCA CGG AAG GTT ATA GTT AAA GGT GGG AGA ATT GTT GCC GTG CAA TTT 1311 
Pro Arg Lys Val lie Val Lys Gly Gly Arg He Val Ala Val Gin Phe 
395 400 405 

60 GTT CGA ACA GAA CAA GAT GAA ACT GGA AAA TGG AAT GAA GAT GAA GAT 1359 

Val Arg Thr Glu Gin Asp Glu Thr Gly Lys Trp Asn Glu Asp Glu Asp 
410 415 420 

CAG ATA GTC CAT CTG AAG GCT GAT GTG GTC ATC AGT GCC TTT GGC TCA 1407 
65 Gin He Val His Leu Lys Ala Asp Val Val lie Ser Ala Phe Gly Ser 
425 430 435 440 

GTG CTG AGG GAT CCT AAA GTA AAA GAA GCC TTG AGC CCT ATA AAA TTT 1455 
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680 



GGC CTG GCT TGT GGG CAG GAT CCA GAG CTG GTG CGG AAC ATC TGT CGC 
Gly Leu Ala Cys Gly Gin Asp Pro Glu Leu Val Arg Asn He Cys Ara 
685 690 695 * 



1599 



1647 



1695 



Val Leu Arg Asp Pro Lys Val Lys Glu Ala Leu Ser Pro He Lys Phe 
•*4 5 -450 455 

AAC AGA TGG GAT CTC CCA GAA GTA GAT CCA GAA A~T ATG CAA — apt ncrt , 
Asn Arg Trp Asp Leu Pro Glu Val Asp Pro Gul r£ Mec «J T^r J£ 1503 
460 470 

GAA CCA TGG GTG TTT GCA GGT GGT GAT ATC GTT GGT ATG GCT AAC Arr 
Glu Pro Trp val Phe Ala Gly Gly Asp He Val G^y SI? aS £n ?£ 
* /5 4 80 485 

ACG GTG GAA TCC GTA AAT GAC GGA AAG CAG GCC TCC TGG TAC ATT rar 
Thr Val Glu Ser Val Asn Asp Gly Lys Gin Ala S^r t£ ™? jYe %£ 

495 500 

AAA TAT ATC CAG GCC CAA TAT GGA GCT TCA GTT TCT GCC AAG CT 
Lys Tyr lie Gin Ala Gin Tyr Gly Ala Ser Val sS Sa £1 g?J 

515 5 20 

CTG CCC CTG TTT TAT ACG CCT GTT GAC CTG GTG GAC ATC AGC ~- r 
Leu Pro Leu Phe Tyr Thr Pro Val Asp Leu Val £p tie ter Va? SJ 

530 535 

ATG GCT GGA TTA AAG TTT ATA AAT — ~~ - — . ™~ _ ^ „ 

Met Ala Gly Leu Lys *re "~ ten ohL "* ^ A ° A GCA GCT x74 ^ 

ql-J ° Phe ^ eu Ala Ala Ala 

545 550 

0°* A S T AGT TCA TCG ATG ATT CGA AGA GCT TTT GAA GCT GGA TGG 

Pro Thr Thr Ser Ser Ser Met lie Arg Arg Ala p£ 22 ?S 

3 560 ^ 

GGT TTT GCC CTG ACC AAA ACT TTC TCT CTT GAT AAG GAC ATA C-r »n 
Gly Phe Ala Leu Thr Lys Thr Phe Ser Leu 2J JyJ Sp v a ? ?£ 

575 sao 

AAT GTC TCA CCC AGA ATC GTC CGG GGG ACT ACC TCT GGC CCC ATG TAC 
Asn val Ser Pro Arg lie Val Arg Gly Thr Thr Ser Sly Pro £r 

595 6oo 

GGC CCT GGA CAA AGC TCC TTC CTG AAT ATT GAG CTC ATP apt ran 
Gly Pro Gly Gin Ser Ser Phe Leu Asn nj 2S ?5 J£ £J 

60S 510 615 y 

ACA GCT GCA TAT TGG TGT CAA AGT GTC ACT GAA CTA AAA GCT GAC TTT 
Thr Ala Ala Tyr Trp Cys Gin Ser Val Thr gT? ™ J£ Ala Sp SS 

CCA GAC AAT ATT GTG ATC GCC AGC ATC ATG TGT AGT TAC AAC AAA AAT 
Pro Asp Asn lie Val lie Ala .Ser He Met Cys Ser Tyr JJJ J£ 

635 640 645 

GAC TGG ATG GAA CTC TCC AGA AAG GCT GAG GCC TCT GGA GCA GAT GCC 2079 
Asp Trp Mec Glu Leu Ser Arg Lys Ala Glu Ala Ser Gly Ala Asp Ala 
650 655 660 

TTG GAG TTA AAT CTG TCA TGT CCA CAC GGC ATG GGA GAA AGA GGA ATG 2127 
Leu Glu Leu Asn Leu Ser Cys Pro His Gly Mec Gly Glu Arg Gly Met 

fc»5 670 575 



1791 



1839 



1887 



1935 



1983 



2031 



2175 



2223 



TGG GTT AGG CAA GCT GTT CAG ATT CCC TTT TTT GCC AAG TTG ACC CCA 2223
Trp Val Arg Gln Ala Val Gln Ile Pro Phe Phe Ala Lys Leu Thr Pro
700 705 710

AAC GTC ACT GAT ATA GTA AGC ATC GCC AGA GCG GCC AAG GAA GGT GGC 2271
Asn Val Thr Asp Ile Val Ser Ile Ala Arg Ala Ala Lys Glu Gly Gly
715 720 725

GCA GAT GGT GTT ACA GCC ACC AAC ACG GTC TCA GGT CTC ATG GGA TTA 2319
Ala Asp Gly Val Thr Ala Thr Asn Thr Val Ser Gly Leu Met Gly Leu
730 735 740

AAA GCC GAT GGC ACG CCC TGG CCA GCG GTG GGT GCT GGC AAG CGG ACT 2367
Lys Ala Asp Gly Thr Pro Trp Pro Ala Val Gly Ala Gly Lys Arg Thr
745 750 755 760

... GGA GTG TCT GGC ACG GCC ATC AGA CCA ATT GCT TTG AGA 2415
Thr Tyr Gly Gly Val Ser Gly Thr Ala Ile Arg Pro Ile Ala Leu Arg
765 770 775

... CCT GGA ... CCC ... TTG GCT 2463
Ala Val Thr Thr Ile Ala Arg Ala Leu Pro Gly Phe Pro Ile Leu Ala
780 785 790

... GAC TCA GCT GAA AGT GGA CTT CAG TTT CTC CAC AGT 2511
Thr Gly Gly Ile Asp Ser Ala Glu Ser Gly Leu Gln Phe Leu His Ser
795 800 805

... AAT CAG GAT TTC 2559
Gly Ala Ser Val Leu Gln Val Cys Ser Ala Val Gln Asn Gln Asp Phe
810 815 820

... TGC ACT GGC CTC ... TTG CTT TAT CTG 2607
Thr Val Ile Gln Asp Tyr Cys Thr Gly Leu Lys Ala Leu Leu Tyr Leu
825 830 835 840

AAA AGC ATT GAA GAA CTA CAA GGC TGG GAT GGG CAG AGT CCA GGT ACC 2655
Lys Ser Ile Glu Glu Leu Gln Gly Trp Asp Gly Gln Ser Pro Gly Thr
845 850 855

GAG AGT CAC CAG AAG GGG AAA CCA GTT CCT CGT ATT GCT GAA CTC ATG 2703
Glu Ser His Gln Lys Gly Lys Pro Val Pro Arg Ile Ala Glu Leu Met
860 865 870

GGA AAG AAA CTG CCA AAT TTT GGA CCT TAT CTG GAG CAA CGC AAG AAA 2751
Gly Lys Lys Leu Pro Asn Phe Gly Pro Tyr Leu Glu Gln Arg Lys Lys
875 880 885

... ATG AGA CTG AAA GAA CAA AAT GCA GCT TTT 2799
Ile Ile Ala Glu Glu Lys Met Arg Leu Lys Glu Gln Asn Ala Ala Phe
890 895 900

CCA CCA CTT GAG AGA AAA CCT TTT ATT CCC AAA AAG CCT ATT CCT GCT 2847
Pro Pro Leu Glu Arg Lys Pro Phe Ile Pro Lys Lys Pro Ile Pro Ala
905 910 915 920

ATT ... GAT GTA ATT GGA AAA GCA CTG CAG TAC CTT GGA ACG TTT GGT 2895
Ile Lys Asp Val Ile Gly Lys Ala Leu Gln Tyr Leu Gly Thr Phe Gly
925 930 935

GAA CTG AGC AAC ATA GAG CAA GTT GTG GCT GTG ATC GAT GAA GAA ATG 2943
Glu Leu Ser Asn Ile Glu Gln Val Val Ala Val Ile Asp Glu Glu Met
940 945 950

TGT ATC AAC TGT GGC AAA TGC TAC ATG ACC TGT AAT GAC TCT GGC TAC 2991
Cys Ile Asn Cys Gly Lys Cys Tyr Met Thr Cys Asn Asp Ser Gly Tyr
955 960 965

... GCT ATC CAG TTT GAT CCC GAA ACC CAC CTG CCC ACC GTT ACT GAC 3039
Gln Ala Ile Gln Phe Asp Pro Glu Thr His Leu Pro Thr Val Thr Asp
970 975 980

ACT TGC ACA GGC TGT ACC CTG TGT CTC TCC GTC TGC CCT ATT ATC GAC 3087
Thr Cys Thr Gly Cys Thr Leu Cys Leu Ser Val Cys Pro Ile Ile Asp
985 990 995 1000

TGC ATC AGA ATG ... TCC AGG ACA ACA CCT TAC GAA ... AAG AGA ... 3135
Cys Ile Arg Met Val Ser Arg Thr Thr Pro Tyr Glu Pro Lys Arg Gly
1005 1010 1015

TTG CCC TTG GCT GTG AAT CCG GTG TGC TGAGGTGATT CGTGGAACAG 3182
Leu Pro Leu Ala Val Asn Pro Val Cys
1020 1025

TTGCTGTGAA CTTTGAGGTC ACCCCCATAT GCTGTCTTTT TAATTGTGGT TATTATACTC 3242
AGCTCTTTCT CAATGAAAAC AAATATAATA TTTCTAGATA AAAGTTCTAA ATACATGTCT 3302
AAATTTTAAA AAACATCTAC TGCCAGAGCC CGTTCAATTA ATGGTCATAA AATAGAATCC 3362
TGCTTTTCTG AGGCTAGTTG TTCAATAACT GCTGCAGTTA ATTGGATGTT CTCCATCAGT 3422
TATCCATTAT GAAAAATATT AACTTTTTTG GTGGCAATTT CCAAATTGCC CTATGCTGTG 3482
CTCTGTCTTT GATTTCTAAT TGTAAGTGAA GTTAAGCATT TTAGAACAAA GTATAATTTA 3542
ACTTTCAAGC AAATGTTTCC AAGGAAACAT TTTATAATTA AAAATTACAA TTTAATTTTA 3602
ACACTGTTCC TAAGCAAATG TAATTAGCTC CATAAAGCTC AAATGAAGTC AAATAATTAT 3662
TTACTGTGGC AGGAAAAGAA AGCCAATGAG GGTTTGCAAA ACTTCTCTAA GGCCCTTTGG 3722
CTGAAATAAC TTCTCTTTGG TGCTACATAC T?AAAGTGAC TGTTTAATCA TCATTCATGT 3782
CACACCGTGC TCCCTCGCCC TCAGGCCTGA GATGGGTCTC CAGACTCCAC CAGTGAATCA 3842
GCATGACACC TTCTTTAACT GTGTGAGCGA CGTTCCTAAC AAAGTAAGGT GTGGGGATGA 3902
AGCTCTGGTT AAAGCCACTC TTTTGCTGTG CTCCGATCTG TTCTATCCGC TTCTGAGAGC 3962
AACCTTCATG ATTACAGCAA TTAATGTTTG CACAGAGCCC AGATTATACA GCAGTGGGTC 4022
ATTGTGCTTC ATTATTCAAG AATGAAGATA AAGACAAATA GAGGATTAGT AAAATATATT 4082
AAATGTGCAA TACCACTTAA ATGACTCTTA ATGTTTATAT TGAATTTCCA AAGCGATTAA 4142
ATAAAAAAGA GCTATTTTTT GTTATTGCCA AACAATATTT TTTGTATTTC TCTATTTTCA 4202
TAATGAGCAA ATAGCATCCT ATAAATCTGT TTATCTCTTC TTTGTAGTGT GTTTTCATAT 4262
AAATCCACAA GTAGAAAATC TTTTCATCTG TGGCATATTT CTATGACAAA TGCAAGATCT
AGAAAAATTA AATGTTTGAT TATGCCATTT TGGAAATGCA TATTTACCAC CAAACCTATG
TGACTGAATA ATGTCAAATA AAATTTTATG AATCATTTTA AAAAAAAAAA AAAAAGGGCG
GCCGC

(2) INFORMATION FOR SEQ ID NO:4:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1025 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:



Met Ala Pro Val Leu Ser Lys Asp Val Ala Asp Ile Glu Ser Ile Leu
1 5 10 15

Ala Leu Asn Pro Arg Thr Gln Ser His Ala Ala Leu His Ser Thr Leu
20 25 30

Ala Lys Lys Leu Asp Lys Lys His Trp Lys Arg Asn Pro Asp Lys Asn
35 40 45

Cys Phe His Cys Glu Lys Leu Glu Asn Asn Phe Gly Asp Ile Lys His
50 55 60

Thr Thr Leu Gly Glu Arg Gly Ala Leu Arg Glu Ala Met Arg Cys Leu
65 70 75 80

Lys Cys Ala Asp Ala Pro Cys Gln Lys Ser Cys Pro Thr His Leu Asp
85 90 95

Ile Lys Ser Phe Ile Thr Ser Ile Ser Asn Lys Asn Tyr Tyr Gly Ala
100 105 110

Ala Lys Met Ile Phe Ser Asp Asn Pro Leu Gly Leu Thr Cys Gly Met
115 120 125

Val Cys Pro Thr Ser Asp Leu Cys Val Gly Gly Cys Asn Leu Tyr Ala
130 135 140

Thr Glu Glu Gly Ser Ile Asn Ile Gly Gly Leu Gln Gln Phe Ala Ser
145 150 155 160

Glu Val Phe Lys Ala Met Asn Ile Pro Gln Ile Arg Asn Pro Cys Leu
165 170 175

Pro Ser Gln Glu Lys Met Pro Glu Ala Tyr Ser Ala Lys Ile Ala Leu
180 185 190

Leu Gly Ala Gly Pro Ala Ser Ile Ser Cys Ala Ser Phe Leu Ala Arg
195 200 205

Leu Gly Tyr Ser Asp Ile Thr Ile Phe Glu Lys Gln Glu Tyr Val Gly
210 215 220

Gly Leu Ser Thr Ser Glu Ile Pro Gln Phe Arg Leu Pro Tyr Asp Val
225 230 235 240

Val Asn Phe Glu Ile Glu Leu Met Lys Asp Leu Gly Val Lys Ile Ile
245 250 255

Cys Gly Lys Ser Leu Ser Glu Asn Glu Ile Thr Leu Asn Thr Leu Lys
260 265 270

Glu Glu Gly Tyr Lys Ala Ala Phe Ile Gly Ile Gly Leu Pro Glu Pro
275 280 285

Lys Thr Asp Asp Ile Phe Gln Gly Leu Thr Gln Asp Gln Gly Phe Tyr
290 295 300

Thr Ser Lys Asp Phe Leu Pro Leu Val Ala Lys Ser Ser Lys Ala Gly
305 310 315 320

Met Cys Ala Cys His Ser Pro Leu Pro Ser Ile Arg Gly Ala Val Ile
325 330 335

Val Leu Gly Ala Gly Asp Thr Ala Phe Asp Cys Ala Thr Ser Ala Leu
340 345 350

Arg Cys Gly Ala Arg Arg Val Phe Leu Val Phe Arg Lys Gly Phe Val
355 360 365

Asn Ile Arg Ala Val Pro Glu Glu Val Glu Leu Ala Lys Glu Glu Lys
370 375 380

Cys Glu Phe Leu Pro Phe Leu Ser Pro Arg Lys Val Ile Val Lys Gly
385 390 395 400

Gly Arg Ile Val Ala Val Gln Phe Val Arg Thr Glu Gln Asp Glu Thr
405 410 415

Gly Lys Trp Asn Glu Asp Glu Asp Gln Ile Val His Leu Lys Ala Asp
420 425 430

Val Val Ile Ser Ala Phe Gly Ser Val Leu Arg Asp Pro Lys Val Lys
435 440 445

Glu Ala Leu Ser Pro Ile Lys Phe Asn Arg Trp Asp Leu Pro Glu Val
450 455 460

Asp Pro Glu Thr Met Gln Thr Ser Glu Pro Trp Val Phe Ala Gly Gly
465 470 475 480

Asp Ile Val Gly Met Ala Asn Thr Thr Val Glu Ser Val Asn Asp Gly
485 490 495

Lys Gln Ala Ser Trp Tyr Ile His Lys Tyr Ile Gln Ala Gln Tyr Gly
500 505 510

Ala Ser Val Ser Ala Lys Pro Glu Leu Pro Leu Phe Tyr Thr Pro Val
515 520 525

Asp Leu Val Asp Ile Ser Val Glu Met Ala Gly Leu Lys Phe Ile Asn
530 535 540

Pro Phe Gly Leu Ala Ser Ala Ala Pro Thr Thr Ser Ser Ser Met Ile
545 550 555 560

Arg Arg Ala Phe Glu Ala Gly Trp Gly Phe Ala Leu Thr Lys Thr Phe
565 570 575

Ser Leu Asp Lys Asp Ile Val Thr Asn Val Ser Pro Arg Ile Val Arg
580 585 590

Gly Thr Thr Ser Gly Pro Met Tyr Gly Pro Gly Gln Ser Ser Phe Leu
595 600 605

Asn Ile Glu Leu Ile Ser Glu Lys Thr Ala Ala Tyr Trp Cys Gln Ser
610 615 620

Val Thr Glu Leu Lys Ala Asp Phe Pro Asp Asn Ile Val Ile Ala Ser
625 630 635 640

Ile Met Cys Ser Tyr Asn Lys Asn Asp Trp Met Glu Leu Ser Arg Lys
645 650 655

Ala Glu Ala Ser Gly Ala Asp Ala Leu Glu Leu Asn Leu Ser Cys Pro
660 665 670

His Gly Met Gly Glu Arg Gly Met Gly Leu Ala Cys Gly Gln Asp Pro
675 680 685

Glu Leu Val Arg Asn Ile Cys Arg Trp Val Arg Gln Ala Val Gln Ile
690 695 700

Pro Phe Phe Ala Lys Leu Thr Pro Asn Val Thr Asp Ile Val Ser Ile
705 710 715 720

Ala Arg Ala Ala Lys Glu Gly Gly Ala Asp Gly Val Thr Ala Thr Asn
725 730 735

Thr Val Ser Gly Leu Met Gly Leu Lys Ala Asp Gly Thr Pro Trp Pro
740 745 750

Ala Val Gly Ala Gly Lys Arg Thr Thr Tyr Gly Gly Val Ser Gly Thr
755 760 765

Ala Ile Arg Pro Ile Ala Leu Arg Ala Val Thr Thr Ile Ala Arg Ala
770 775 780

Leu Pro Gly Phe Pro Ile Leu Ala Thr Gly Gly Ile Asp Ser Ala Glu
785 790 795 800

Ser Gly Leu Gln Phe Leu His Ser Gly Ala Ser Val Leu Gln Val Cys
805 810 815

Ser Ala Val Gln Asn Gln Asp Phe Thr Val Ile Gln Asp Tyr Cys Thr
820 825 830

Gly Leu Lys Ala Leu Leu Tyr Leu Lys Ser Ile Glu Glu Leu Gln Gly
835 840 845

Trp Asp Gly Gln Ser Pro Gly Thr Glu Ser His Gln Lys Gly Lys Pro
850 855 860

Val Pro Arg Ile Ala Glu Leu Met Gly Lys Lys Leu Pro Asn Phe Gly
865 870 875 880

Pro Tyr Leu Glu Gln Arg Lys Lys Ile Ile Ala Glu Glu Lys Met Arg
885 890 895

Leu Lys Glu Gln Asn Ala Ala Phe Pro Pro Leu Glu Arg Lys Pro Phe
900 905 910

Ile Pro Lys Lys Pro Ile Pro Ala Ile Lys Asp Val Ile Gly Lys Ala
915 920 925

Leu Gln Tyr Leu Gly Thr Phe Gly Glu Leu Ser Asn Ile Glu Gln Val
930 935 940

Val Ala Val Ile Asp Glu Glu Met Cys Ile Asn Cys Gly Lys Cys Tyr
945 950 955 960

Met Thr Cys Asn Asp Ser Gly Tyr Gln Ala Ile Gln Phe Asp Pro Glu
965 970 975

Thr His Leu Pro Thr Val Thr Asp Thr Cys Thr Gly Cys Thr Leu Cys
980 985 990

Leu Ser Val Cys Pro Ile Ile Asp Cys Ile Arg Met Val Ser Arg Thr
995 1000 1005

Thr Pro Tyr Glu Pro Lys Arg Gly Leu Pro Leu Ala Val Asn Pro Val
1010 1015 1020

Cys
1025
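
The correspondence between the nucleotide rows of SEQ ID NO:3 and the residues of SEQ ID NO:4 can be spot-checked with a short script. The following is a minimal sketch, not part of the original listing. It assumes the standard genetic code and assumes that the coding region begins at nucleotide 88, a value inferred from the printed row-end positions 2079 and 2127, which close residues 664 and 680. The names `translate` and `codon_end_position` are illustrative only.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# One-letter to three-letter codes, matching the style of the listing.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "Ter",
}


def translate(nucleotides):
    """Translate a row of codons (spaces allowed) into three-letter residues."""
    seq = "".join(nucleotides.split()).upper()
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    return [THREE_LETTER[CODON_TABLE[seq[i:i + 3]]] for i in range(0, usable, 3)]


def codon_end_position(residue_number, coding_start=88):
    """Nucleotide position of the last base of a residue's codon.

    coding_start=88 is an assumption inferred from the printed positions.
    """
    return coding_start + 3 * residue_number - 1


# Row of SEQ ID NO:3 printed with end position 2079 (residues 649-664).
row = "GAC TGG ATG GAA CTC TCC AGA AAG GCT GAG GCC TCT GGA GCA GAT GCC"
residues = translate(row)
```

Under these assumptions, `residues` reproduces the residues printed beneath that row, and `codon_end_position(664)` agrees with the printed position 2079.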
(2) INFORMATION FOR SEQ ID NO:5:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (primer)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:5:
GCAAGGAGGG TTTGTCACTG 20

(2) INFORMATION FOR SEQ ID NO:6:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (primer)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:6:
CCGATTCCAC TGTAGTGTTA GCC 23

(2) INFORMATION FOR SEQ ID NO:7: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (primer) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:7:
TAACACTACA GTGGAATCGG 20

(2) INFORMATION FOR SEQ ID NO:8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (primer) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:8:
AAATCCAGGC AGAGCACGAG 20

(2) INFORMATION FOR SEQ ID NO:9:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (primer)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:9:
TGCTCGTGCT CTGCCTGGAT TTCC 24

(2) INFORMATION FOR SEQ ID NO:10:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (primer)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:10:
ATTGAATGGT CATTGACATG AGAC 24
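
The primer listings above can be checked mechanically for transcription errors. The sketch below is illustrative and not part of the original listing. It compares each primer's length with the length stated in its header and tests whether one primer lies on the opposite strand of another. The helper names `reverse_complement` and `find_on_either_strand` are hypothetical.

```python
# Primer sequences transcribed from SEQ ID NOs 5-10, with spaces removed.
PRIMERS = {
    5: "GCAAGGAGGGTTTGTCACTG",
    6: "CCGATTCCACTGTAGTGTTAGCC",
    7: "TAACACTACAGTGGAATCGG",
    8: "AAATCCAGGCAGAGCACGAG",
    9: "TGCTCGTGCTCTGCCTGGATTTCC",
    10: "ATTGAATGGTCATTGACATGAGAC",
}

# Lengths (in bases) as stated in each SEQUENCE CHARACTERISTICS header.
STATED_LENGTHS = {5: 20, 6: 23, 7: 20, 8: 20, 9: 24, 10: 24}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_on_either_strand(query, target):
    """Locate query in target, or its reverse complement in target.

    Returns ("forward", index), ("reverse", index), or None.
    """
    idx = target.find(query)
    if idx >= 0:
        return ("forward", idx)
    idx = target.find(reverse_complement(query))
    if idx >= 0:
        return ("reverse", idx)
    return None


length_ok = {n: len(s) == STATED_LENGTHS[n] for n, s in PRIMERS.items()}
overlap_7_in_6 = find_on_either_strand(PRIMERS[7], PRIMERS[6])
overlap_8_in_9 = find_on_either_strand(PRIMERS[8], PRIMERS[9])
```

With the sequences as transcribed here, every length matches its header, SEQ ID NO:7 is the reverse complement of the first 20 bases of SEQ ID NO:6, and the reverse complement of SEQ ID NO:8 lies within SEQ ID NO:9. This is consistent with overlapping primer pairs on opposite strands, although the listing itself does not state how the primers were designed.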
(2) INFORMATION FOR SEQ ID NO:11:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 12 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:11:

Cys Xaa Xaa Cys Xaa Xaa Cys Xaa Xaa Xaa Cys Xaa
1 5 10

(2) INFORMATION FOR SEQ ID NO:12:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 12 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:12:

Cys Xaa Xaa Cys Xaa Xaa Cys Xaa Xaa Xaa Cys Pro
1 5 10
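
The two cysteine motifs of SEQ ID NO:11 and SEQ ID NO:12 can be located in a protein sequence with a regular expression in which each Xaa matches any residue. The sketch below is illustrative only. The fragment is residues 945 to 1008 of SEQ ID NO:4 converted to one-letter code, and the names `SEQ11_PATTERN`, `SEQ12_PATTERN`, and `motif_positions` are hypothetical.

```python
import re

# Xaa is treated as "any residue" (regex "."). A lookahead is used so that
# overlapping occurrences would also be reported.
SEQ11_PATTERN = re.compile(r"(?=C..C..C...C.)")   # Cys-X2-Cys-X2-Cys-X3-Cys-X
SEQ12_PATTERN = re.compile(r"(?=C..C..C...CP)")   # Cys-X2-Cys-X2-Cys-X3-Cys-Pro

# Residues 945-1008 of SEQ ID NO:4, one-letter code.
FRAGMENT_START = 945
FRAGMENT = (
    "VAVIDEEMCINCGKCY"   # 945-960
    "MTCNDSGYQAIQFDPE"   # 961-976
    "THLPTVTDTCTGCTLC"   # 977-992
    "LSVCPIIDCIRMVSRT"   # 993-1008
)


def motif_positions(pattern, fragment, start):
    """Return the residue numbers at which the motif begins."""
    return [start + m.start() for m in pattern.finditer(fragment)]


seq11_hits = motif_positions(SEQ11_PATTERN, FRAGMENT, FRAGMENT_START)
seq12_hits = motif_positions(SEQ12_PATTERN, FRAGMENT, FRAGMENT_START)
```

On this fragment the SEQ ID NO:11 pattern matches at residues 953 and 986, and the more specific SEQ ID NO:12 pattern, which ends in Pro, matches only at residue 986.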

(2) INFORMATION FOR SEQ ID NO:13:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 17 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:13:

Val Xaa Val Xaa Gly Xaa Gly Xaa Xaa Gly Xaa Xaa Xaa Ala Xaa Xaa
1 5 10 15

Ala
WHAT IS CLAIMED IS:

1. An isolated nucleic acid encoding a dihydropyrimidine dehydrogenase (DPD) protein, said nucleic acid capable of selectively hybridizing to a second nucleic acid consisting of the nucleotide sequence of Seq. ID No. 1 or Seq. ID No. 3 under stringent hybridization conditions.

2. The nucleic acid of claim 1 wherein the nucleic acid is of human origin.

3. The nucleic acid of claim 2 wherein the nucleic acid consists of the nucleotide sequence of Seq. ID No. 1.

4. The nucleic acid of claim 1 wherein the nucleic acid is of pig origin.

5. The nucleic acid of claim 4 wherein the nucleic acid consists of the nucleotide sequence of Seq. ID No. 3.

6. The nucleic acid of claim 1 wherein the nucleic acid is full-length.

7. An isolated nucleic acid that codes for a DPD polypeptide, wherein a polypeptide expressed from the nucleic acid specifically binds to an antibody generated against an immunogen consisting of a DPD polypeptide having an amino acid sequence as depicted by Seq. ID No. 2 or Seq. ID No. 4.

8. The nucleic acid of claim 7 wherein the nucleic acid is of human origin.

9. The nucleic acid of claim 8 wherein said nucleic acid consists of the polynucleotide sequence of Seq. ID No. 1.

10. The nucleic acid of claim 7 wherein said nucleic acid is of pig origin.

11. The nucleic acid of claim 10 wherein said nucleic acid consists of the polynucleotide sequence of Seq. ID No. 3.

12. The nucleic acid of claim 7 wherein said nucleic acid is full-length.

13. An oligonucleotide probe that is capable of selectively hybridizing, under stringent hybridizing conditions, to a DPD nucleic acid having a nucleotide sequence of Seq. ID No. 1 or Seq. ID No. 3.

14. An oligonucleotide probe of claim 13 that is between about 10 and 100 nucleotides in length.

15. A method for determining whether a patient is at risk of a toxic reaction to 5-fluorouracil, the method comprising analyzing DPD DNA or mRNA in a sample from the patient to determine the amount of intact DPD nucleic acid, wherein an enhanced risk of a toxic reaction to 5-fluorouracil is indicated by a decrease in the amount of intact DPD DNA or mRNA in the sample compared to the amount of DPD DNA or mRNA in a sample obtained from a patient known to not have a DPD deficiency.

16. A method of claim 15 wherein an enhanced risk of a toxic reaction is indicated by a decrease of greater than about 70%.

17. A method of claim 15 wherein an increased risk of a toxic reaction is indicated by a decrease of greater than about 50%.

18. The method of claim 15, wherein the method comprises the steps of:

(a) obtaining a cellular sample from the patient;

(b) extracting DNA or RNA from the sample;

(c) hybridizing a probe comprising a DPD nucleic acid to the DNA or RNA from the sample; and

(d) determining whether the DPD nucleic acid binds to the DNA or RNA.

19. The method of claim 15, wherein the DPD nucleic acid is analyzed by RT-PCR.

20. The method of claim 15, wherein the DPD nucleic acid is analyzed by PCR sequencing of genomic DNA from the patient.

21. A method of claim 15 wherein the cellular sample comprises lymphocytes.

22. A method of claim 15 wherein the probe is an oligonucleotide probe that is capable of selectively hybridizing, under stringent hybridizing conditions, to a DPD nucleic acid having a nucleotide sequence or a specific subsequence of that shown in Seq. ID No. 1 or Seq. ID No. 3.

23. A method of claim 22 wherein the oligonucleotide probe is between about 10 and 100 nucleotides in length.

24. A method for expressing recombinant DPD protein in a prokaryotic cell, the method comprising the steps of:

a) transfecting the cell with an expression vector comprising a promoter that is operably linked to a nucleic acid that encodes DPD; and

b) incubating the cell in a medium that contains uracil to allow expression of the recombinant DPD protein.

25. A method of claim 24 wherein the medium contains about 100 µM uracil.

26. A method of claim 24 wherein the medium contains 100 µM each of FAD and FMN.

27. An expression vector comprising a selectable marker, wherein the selectable marker is a nucleic acid that encodes DPD.

28. An expression vector as in claim 27 wherein the selectable marker is operably linked to at least one promoter.

29. An expression vector as in claim 28 wherein the promoter functions in a eukaryote.

30. An expression vector as in claim 28 wherein the promoter functions in a prokaryote.

31. An expression vector as in claim 28 wherein the selectable marker is operably linked to both a prokaryotic and a eukaryotic promoter.
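
The comparison recited in claims 15 to 17 reduces to a percentage decrease relative to a reference sample. The following is a minimal sketch of that arithmetic, not a clinical procedure and not part of the claims. The function names, the result labels, and the treatment of "about 70%" and "about 50%" as exact cut-offs are assumptions made for illustration.

```python
def percent_decrease(patient_level, reference_level):
    """Percentage by which the patient's intact DPD nucleic acid level falls
    below the level in a sample from a person without DPD deficiency."""
    if reference_level <= 0:
        raise ValueError("reference_level must be positive")
    return 100.0 * (reference_level - patient_level) / reference_level


def risk_category(patient_level, reference_level):
    """Classify a measurement against the two thresholds named in the claims.

    Assumption: the 70% and 50% figures are applied as strict cut-offs.
    """
    decrease = percent_decrease(patient_level, reference_level)
    if decrease > 70.0:
        return "decrease exceeds the 70% threshold of claim 16"
    if decrease > 50.0:
        return "decrease exceeds the 50% threshold of claim 17"
    return "decrease does not exceed either threshold"


# Hypothetical measurements in arbitrary units, for illustration only.
example = risk_category(patient_level=25.0, reference_level=100.0)
```

The units are arbitrary because only the ratio to the reference sample enters the calculation.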